Abstract
Big data is of significant importance in healthcare because it is used to predict outcomes in disease prevention and mortality and to achieve cost savings in medical treatment. Big data has become a repository whose information is used for the treatment and surveillance of diseases. Value-based programs are giving healthcare providers an incentive to investigate new methods of leveraging health data to measure the quality and efficiency of care. In this sector, the privacy of patient data is a key concern for healthcare providers. In terms of market adoption, the big data revolution in healthcare is at an early stage, with the potential to create value and drive business development. The trend toward value-based healthcare delivery is fostering collaboration among stakeholders to ensure that it adds value to patient treatment and protects the privacy of healthcare data and information. The study uses a systematic review methodology to categorize the uses of big data in healthcare. This paper presents an analysis of healthcare data using big data in the healthcare sector.
Keywords: Big data, patient privacy, healthcare domain, healthcare data
Acknowledgement
Conducting this research has been one of the most enriching experiences of my life. Its contribution to my knowledge base and analytical skills has been paramount. It gave me the opportunity to face challenges in the process and overcome them. This would not have been possible without the valuable guidance of my professors, peers and all the people who have contributed to this enriching experience. I would like to take this opportunity to thank my supervisor _________________________ for the constant guidance and support provided to me during the process of this research. It would not be right if I did not thank my academic guides for their important and valuable assistance and encouragement throughout the research process. I would also like to thank my friends, who provided me with help and encouragement in collecting primary data and valuable resources. The support of all these people has been inspiring and enlightening throughout the process of research in the subject.
Heartfelt thanks and warmest wishes,
Yours Sincerely,
Table of Contents
Chapter 1: Introduction
1.1 Epistemology
1.2 Research questions
1.3 Impacts and significance
1.4 Personal reflexivity
1.5 Structure and overview
Chapter 2: Background
2.1 Definition of terms
2.2 Big data evolution in healthcare
2.3 Ways to leverage big data
2.4 Big data applications for healthcare
2.5 Role of big data analytics in the healthcare industry
Chapter 3: Literature review
Chapter 4: Theory
4.1 Big data analytics theories in healthcare sector
4.2 Theoretical framework of the study
4.3 Social representation theory
Chapter 5: Methodology
5.1 Introduction
5.2 Research Philosophy
5.3 Research Approach
5.4 Research Design
5.5 Data Collection methods
5.6 Data Sampling and Analysis
5.7 Ethical Consideration
5.8 Limitations
Chapter 6: Outputs and contributions
6.1 Outputs of the study based on data collection
6.2 Impact of contributions
6.3 Limitations of the study
References
Appendix
1. Survey questionnaire

Chapter 1: Introduction
This research analyses healthcare data with big data. Big data technologies have made an impact in healthcare-related fields such as medical diagnosis from imaging data and the quantification of lifestyle data. Raghupathi and Raghupathi (2014) stated that there is evidence that the existing mounds of big data hold hidden knowledge that could change the life of a patient and even the world itself. Big data technologies have the potential to unlock productivity bottlenecks and to improve the quality and accessibility of the healthcare system. Big data has been defined in a wide range of ways in healthcare research. In the healthcare industry, big data encompasses high-volume, high-diversity environmental and lifestyle information gathered, in relation to health, from single individuals up to large cohorts (Agarwal & Dhar, 2014). The concept of big data is not new, but it is a way to describe the changes taking place in healthcare. Healthcare is a prime example of the three Vs of data: velocity, variety and volume. A fourth V, the veracity of healthcare data, is critical to the development of translational research. Medical imaging provides information on anatomy and organ function to detect disease states (Archenaa & Anita, 2015). As the size of the data increases, understanding the dependencies among the data and designing accurate methods demand new computer-aided techniques and platforms.
Costa (2014) argued that the rapid growth in healthcare organizations and in the number of patients has resulted in greater use of computer-aided medical diagnostics and decision support systems in clinical settings. Integrating computer analysis with appropriate care has the potential to help clinicians improve diagnostic accuracy. Integrating medical images with electronic health records improves efficiency and reduces the time taken for diagnosis (Bates et al., 2014). Medical imaging is therefore reviewed here from the big data point of view. Medical imaging encompasses a wide spectrum of image acquisition methodologies. The goal of medical imaging here is to improve the depicted content (Chen, Chiang, & Storey, 2012). A framework developed to analyse and transform large datasets is Hadoop, which employs MapReduce. MapReduce is a programming paradigm that provides scalability across the servers in a Hadoop cluster and has real-world applications. The framework has been used to increase the speed of three large-scale medical image processing use cases. In application areas such as public healthcare, the big data value chain comprises the collection, processing, storage, distribution and assessment of precise data (Jee & Kim, 2013). The current research approach uses individual health data gathered during monitoring and diagnosis.
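The MapReduce paradigm mentioned above can be illustrated with a minimal single-machine sketch. It is not Hadoop code and is not taken from the study: the record layout, the diagnosis codes and the function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical and chosen only for illustration. The sketch counts how often each diagnosis code appears across a set of patient records, which is the kind of aggregation a Hadoop cluster would distribute across servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit one (diagnosis_code, 1) pair per diagnosis in a single record."""
    return [(code, 1) for code in record["diagnoses"]]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical toy records; not data from the study.
records = [
    {"patient": "p1", "diagnoses": ["E11", "I10"]},
    {"patient": "p2", "diagnoses": ["I10"]},
    {"patient": "p3", "diagnoses": ["E11", "I10", "J45"]},
]
counts = run_job(records)
print(counts)
```

In a real cluster the map and reduce functions would run in parallel on different servers and the shuffle step would move data between them; only the division of work into the two phases is shown here.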
1.1 Epistemology
The goal of this research study is to provide a list of instances of big data that could be incorporated by different councils. The researcher suggests key priorities for big data in public health practice and healthcare. The added value is analysed with regard to sustaining healthcare processes, enhancing the quality and efficiency of treatment, fighting chronic diseases and supporting healthy lifestyles, which are key factors in chronic disease (Singh & Reddy, 2015). The study also provides a list of suggestions intended as guidelines for developing the big data value chain.
1.2 Research questions
The proposed research study explores a complex space from research perspectives generated by design and culture, while bridging the gaps between them. The study develops a set of research questions aligned with the literature and theoretical work. The research questions are as follows:
RQ1: What are the instances for the utilization of big data in the telemedicine, healthcare and public health practices?
This research question studies big data in public health, telemedicine and healthcare and identifies examples of its use. The examples are identified through a systematic literature review. Each use of big data is then evaluated on its added value and the quality of the supporting evidence.
RQ2: What additional value can big data bring to the sustainability of health processes, the quality and efficiency of treatment, the fight against chronic disease and the support of a healthy lifestyle?
This research question rests on estimating the environmental burden of disease; understanding environment-health interactions can support more effective design. Knowledge of environmental factors can help policymakers improve quality of life and strengthen efforts to support a healthy lifestyle.
1.3 Impacts and significance
In the coming decades, healthcare is predicted to grow at an unprecedented rate, and so is the data associated with it. It is becoming a challenge for the industry to analyse this volume of data and turn it into actionable medical insights. The introduction of big data into healthcare, together with cloud computing, has given a new direction to medical models. With the cloud, the healthcare industry can upload more information, while big data analytics derives insights from that data (Roski, Bo-Linn, & Andrews, 2014). This offers a progressive path for the healthcare sector. Hospitals and medical operations produce data on a daily basis. The key significance of big data is that it brings sophisticated methods for consolidating information from these sources. The focus of this study is on providing relevant and up-to-date information to doctors in real time while they consult with patients. Cloud-based big data platforms store and analyse data from many resources (Wang, Kung, & Byrd, 2018). Patients' medical data are personal and must be protected and secured against loss. As computing technology improves, big data in the healthcare sector and its privacy are of the highest priority.
1.4 Personal reflexivity
Reflexivity on the part of the investigator is well established as a tool in qualitative research, and it is a key requirement here because of the interdisciplinary nature of this study. Because the study converges several fields, I have addressed its goals and research questions with this in mind. I have observed that the fast-expanding field of big data analytics has started to play a role in the evolution of healthcare practices (Chen et al., 2017). It also provides tools for managing and analysing the large volumes of unstructured data produced by the healthcare system. I have applied big data analytics to aid the exploration of care delivery. Potential areas of research in this field could have an impact on healthcare. Medical imaging encompasses a wide spectrum of image acquisition methodologies used for clinical applications (Lo’ai et al., 2017). I have found that analysing the volume and variety of medical data is the major challenge. Advances in medical imaging make individualized care practical and provide qualitative information for various medical applications. A large amount of space is required to store the data, analyse it, and map the findings and dependencies among the various data.
1.5 Structure and overview
The outline of this research study is organized around addressing the research questions and analysing healthcare data with big data. The study has the following sections:
Introduction: This section introduces the phenomenon under investigation, both to situate the research and to give a complete understanding of the study. It provides the research context and the state of the art.
Background: This section describes the impact of big data on the healthcare sector. As the volume of data on the internet grows day by day, big data has become popular in the market. The section shows how healthcare industries are stepping into the big data pool to benefit from advanced tools and technologies.
Literature review: Through the literature review, this research introduces the theoretical perspectives relevant to the present study. It discusses the theoretical constructs and literature used to collect data and analyse it against the research questions.
Theory: This section presents the issues healthcare systems face in using big data technologies. It also provides the theories and applications that underpin the use of big data in healthcare.
Methodology: This section presents the methods used for healthcare data analytics, which support better decision making to raise business value and customer interest. Big data techniques are applied to develop systems for the earlier diagnosis of chronic diseases and to build integrated data analytics platforms.
Anticipated timelines: This section gives the total time required to complete the research study.
Outputs and contributions: This section discusses the anticipated results through a critical analysis of the impact, limitations and justification of the significance of the research.
Chapter 2: Background
This chapter explains the relationship between big data and healthcare data. Big data supports the use of healthcare data. The initial review was carried out to explore how big data can be applied to obtain the maximum benefit from targeted research.
2.1 Definition of terms
Several key terms are used in this research study. They are drawn from the wide range of literature that has informed this work and are presented here to help navigate the field-specific material in the following sections:
Device monitoring: Capturing and analysing, in real time, large volumes of fast-moving data from in-hospital devices for safety monitoring and event prediction.
Volume: The amount of data to be managed, measured in terabytes and petabytes. It concerns the management of data storage.
Variety: The format of the data, which may be structured, semi-structured or unstructured.
Velocity: The frequency at which data is produced, processed and analysed.
In my view, the definitions of big data terms refer to the various tools used to store and analyse large volumes of data.
2.2 Big data evolution in healthcare
Before most people knew what a computer was, Dr. Lawrence Weed developed an EHR system. Storage and mainframes were expensive, so hospitals shared the technology. It was a long time before EHRs were adopted by hospitals across the nation. In recognition of the importance of EHRs for improving the healthcare system, health information technology legislation was introduced in 2009 to encourage organizations to adopt electronic records. It was implemented to improve patient security and boost effectiveness, increase penalties for non-compliance and improve data breach notifications. The fixes included increased protection and greater control for patients over their own data. Hospitals must adhere to data security guidelines or face the consequences of non-compliance (Gandomi & Haider, 2015). The requirements have evolved to include physical and technical safeguards that keep patient data secure. Despite concerns about patient privacy, there is no denying that healthcare data play an important role in patient care.
2.3 Ways to leverage big data
Big data unlocks significant value by making information more transparent. As organizations create and store transactional data in digital form, big data allows narrower segmentation of customers and more precisely tailored products and services. Sophisticated analytics improve decision making, minimize risks and surface valuable insights that would otherwise remain hidden. The power of big data lies in the insights drawn from large data sets, which lead to favourable and meaningful outcomes, support decision making and maximize business impact (Eswari, Sampath, & Lavanya, 2015). People, processes and analytic tools are all needed to realize the potential business benefits. Big data can help improve outcomes and reduce medical errors. Big data tools facilitate evidence-based care that is personalized to particular patients. Big data also supports cost-effective healthcare, such as reimbursement based on patient outcomes and the elimination of fraud and abuse in the system. Healthcare outcomes improve when care providers receive timely insights and can administer effective treatments. Secure data technologies make it possible to parse large amounts of information.
2.4 Big data applications for healthcare
An analysis of the healthcare sector shows that big data applications aim to align with the need for improved quality, the rising cost of healthcare and the need to improve the efficiency of care. Big data applications require a means to describe and align data sources, a means to ensure high data quality, and a means to address data privacy and security. These are prerequisites for data analytics on integrated datasets. Compliance with high data security and privacy standards is needed to protect the sensitive nature of health data (Archenaa & Anita, 2015). In addition, highly standardized documentation and systematic analysis of health and outcome data for particular patient populations are needed. To achieve this, large datasets encompassing clinical, administrative and financial data are identified as clinically most efficient (Roski, Bo-Linn, & Andrews, 2014). Big data analytics has the potential to transform the way healthcare providers use sophisticated technologies to gain insights from clinical and other data repositories and to make informed decisions.
2.5 Role of big data analytics in the healthcare industry
According to Chen et al. (2017), 80 percent of healthcare information is unstructured data that is so vast and complex that it requires specific methods and tools to make meaningful use of it. New and emerging technologies, along with predictive analytics, are giving healthcare technologists and leaders new tools to capture health-related data and processes for a complete transformation of the industry. Big data analytics is playing a role in medicine by building better health profiles and better predictive models around individual patients to support better diagnosis and treatment of diseases.
Chapter 3: Literature review
(As per update by the student, please insert the literature review from the proposal)
Chapter 4: Theory
4.1 Big data analytics theories in healthcare sector
Raghupathi and Raghupathi (2014) discussed several data mining theories that are applied in the healthcare sector. Zhang et al. (2017) stated that data mining plays a key role in extracting data from databases for analysis, which is convenient for data integration. A new healthcare management model based on a cloud platform has been designed. Such a healthcare system can manage healthcare networks and intervene before diseases occur. Inductive learning theory, which belongs to the category of machine learning, can extract general rules and patterns by summarizing past healthcare data.
Wang, Kung and Byrd (2018) proposed clustering theory, which is applied to target individuals for preventive measures using attributes such as age, height and weight. Regression analysis is used to analyse the effect of a proposed radiation treatment on the reduction in tumour size. The review of the literature suggests that big data analytics applications are useful in the healthcare industry. Dimitrov (2016) suggested that private health insurers use big data to analyse health insurance claims and detect fraud and errors in the information system. This helps the organization identify hidden costs, fraud and human error that a traditional processing system would not detect. With regard to improving energy efficiency in the healthcare sector, Manogaran et al. (2017) provided examples such as data manipulation to analyse the use of big data analytics in the sector. Patil and Seshadri (2014) discussed healthcare applications that support segmentation and predictive modelling of patient profiles to assist in precautionary processes.
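The clustering of individuals by age, height and weight described above can be sketched with a plain k-means routine. This is an illustrative sketch only: k-means is one possible clustering technique and is assumed here, since the cited work is not quoted as naming a specific algorithm. The patient values, the choice of two clusters and the function name kmeans are hypothetical, and the starting centroids are fixed to the first k points so that the result is reproducible. In practice the attributes would normally be standardized first so that no single unit dominates the distance.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (age in years, height in cm, weight in kg); not data from the study.
patients = [
    (25, 170, 65),
    (62, 165, 90),
    (30, 175, 70),
    (58, 160, 95),
    (28, 168, 60),
    (65, 170, 88),
]
labels, centroids = kmeans(patients, k=2)
print(labels)
print(centroids)
```

Each resulting group could then be reviewed separately, for example to decide which group should be offered a given preventive measure.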
4.2 Theoretical framework of the study
The theoretical framework of this study explains the variables used to understand the concepts of big data and its contribution to healthcare data. The variables are shown in the figure below and have been explained in the literature review to give a fuller picture of big data.

Figure 4.1: Theoretical framework of the research study
(Source: Created by author)
The figure above shows that healthcare data is the independent variable and big data transformation is the mediating variable. The theoretical framework has three dependent variables: the added value of big data in health practices, the derivation of policy actions for big data in health, and big data in healthcare. Because the healthcare industry is large, the combination of healthcare data and big data is also large, which makes the data difficult for an organization to manage. The Hadoop approach is one of the key choices in line with current trends. This data processing supports mathematical processing (Wang et al., 2018). The data sources help to process and assess the end outcomes of data with various features.
The hypotheses of this paper set out the conjectures that are accepted or rejected on the basis of the answers obtained in the research. The research hypotheses are as follows:
H0: Big Data is not useful for the assessment of the healthcare data
H1: Big Data is useful for the assessment of the healthcare data
4.3 Social representation theory
Bhatt, Dey and Ashour (2017) stated that social representation theory (SRT) can be used as a method to guide the development of a theoretical framework. SRT allows the study of social dynamics, but the structure of the theoretical framework is determined by the research requirements. This section discusses the development of a theoretical framework for studying the influence of big data on business-IT alignment in the United States healthcare sector. The process of developing such a framework is applied to healthcare systems in the US. The framework serves as a basis for empirical studies that aim to develop the theory (Riggins & Wamba, 2015). The literature study covers big data and healthcare data and highlights the identified gaps, and SRT is then discussed. A description of the healthcare system is also provided. The aim is to analyse the frameworks in the literature that explore the influence of big data on the healthcare sector.
This paper adopts a research approach that aims to develop the theoretical framework using SRT to study the social dynamics associated with implementing big data analytics in the healthcare sector (Storey & Song, 2017). SRT provides a holistic stance for understanding social groups. SRT is used in various fields, such as the social sciences, organizational change, IT implementation, healthcare and information security. Tsai et al. (2016) defined social representation as the elaboration of social objects for the purpose of acting and conversing. Zakir, Seymour, and Berg (2015) outlined SRT as a framework for social production, which is how it is used in this study. It provides conceptual tools to address the social context and capture the sequential nature of social activities. Big data is likely to change aspects of traditional businesses. SRT is therefore an appropriate approach to guide a study of big data alignment in social dynamics (Rajaraman, 2016). Data analytics has brought a significant revolution in healthcare over the last decades. With the adoption of electronic health records and digital tools, both structured and unstructured data need to be processed and analysed. Analytical tools can influence health systems to improve clinical outcomes. Cloud and IT infrastructure allow large quantities of data to be processed in real time. Healthcare data meet the definition of big data (Kambatla et al., 2014). The theories suggest that research on healthcare data is popular because it is needed by healthcare providers and patients. The researcher found that big data is widely used in telemedicine, healthcare and public health practices, where it adds value to health processes (Aggarwal, Mishra, & Bhatnagar, 2018). It enhances the quality and efficiency of treatment, helps fight chronic disease and supports a healthy lifestyle.
The healthcare system has numerous disparate, continuous monitoring devices that use discretized critical information to provide alert mechanisms in case of events (Najafabadi et al., 2015). Several developments in the healthcare sector, including escalating healthcare costs, increasing healthcare coverage and shifting provider reimbursement trends, are triggering demand for big data technology. The theories related to big data suggest that it contributes to increased profits and reduces wasteful overheads, and that in the healthcare industry it is applied to predict diseases and improve quality of life. García et al. (2016) discussed that decisions about change are driven by data. The focus is on understanding patients as early in life as possible. As technology strengthens, the data sources and volumes available to the healthcare sector for research grow at the same pace.
Gandomi and Haider (2015) argued that big data provides better insights as data volumes and sources expand. Big data analytics and the IoT have revolutionized the way different user statistics are tracked. Gandomi and Haider (2014) noted that body vitals can be monitored continuously through sensor data collection. This allows healthcare organizations to keep people out of hospital, because they are able to identify health-related issues and provide care before situations worsen. The healthcare industry benefits from advanced analytics and big data technologies. Reyes-Ortiz, Oneto, and Anguita (2015) concluded that predictive analytics promotes quality care and patient safety. Readmissions, in which patients return within 30 days of treatment, are considered a chronic problem in hospitals. Big data analytics reveals trends that highlight which patients require treatment to prevent readmission. Medical data is a target for cyber thieves because it yields personal information that is more valuable than credit card data (Dagade et al., 2015). Big data analytics is a valuable resource for securing medical records. Information and communication technology plays a key role in improving healthcare for individuals, and it helps to improve health systems and eliminate medical errors. Big data analytics is the process of exploring huge and varied data sets to reveal hidden patterns and market trends.
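The 30-day readmission measure mentioned above can be made concrete with a short sketch. It assumes, for illustration only, that a readmission is a stay that begins within 30 days of the same patient's previous discharge; the record layout, the dates and the function name thirty_day_readmissions are hypothetical and do not come from the study.

```python
from datetime import date


def thirty_day_readmissions(stays, window_days=30):
    """Return (patient, admission_date) for stays starting within the window after a prior discharge."""
    readmitted = []
    last_discharge = {}
    # Process each patient's stays in order of admission date.
    for patient, admitted, discharged in sorted(stays, key=lambda s: (s[0], s[1])):
        previous = last_discharge.get(patient)
        if previous is not None and 0 <= (admitted - previous).days <= window_days:
            readmitted.append((patient, admitted))
        last_discharge[patient] = discharged
    return readmitted


# Hypothetical stays as (patient, admitted, discharged); not data from the study.
stays = [
    ("p1", date(2018, 1, 2), date(2018, 1, 6)),
    ("p1", date(2018, 1, 20), date(2018, 1, 22)),  # 14 days after the previous discharge
    ("p2", date(2018, 2, 1), date(2018, 2, 3)),
    ("p2", date(2018, 4, 1), date(2018, 4, 2)),    # 57 days after the previous discharge
]
flags = thirty_day_readmissions(stays)
print(flags)
```

A flag of this kind is the sort of outcome variable a predictive model would be trained on to identify patients who may need additional care before discharge.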

Chapter 5: Methodology
5.1 Introduction
Research and development has proven to be a crucial milestone for human development. To support that development, research is being conducted on many different aspects, zones and industries. A difference in subject calls for a different research design and methodology. According to Mackey and Gass (2015), research on a particular subject yields adequate results only if an appropriate research design and methodology are selected. The research design and methods for this work were therefore selected on the basis of its subject and relevance. The research philosophy is post-positivist, the research design is exploratory and the research approach is deductive. The data are primary and quantitative and were collected using a probability sampling technique. Data analysis was carried out through regression analysis, and the population was the residents of the city of Jacksonville. Data were collected from a sample of 50 respondents. The following sections give a clearer view of each aspect of the research design and methodology.
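The regression analysis referred to above can be illustrated with a minimal ordinary least squares sketch for a single predictor. It is offered under stated assumptions: the study does not specify its regression model here, so simple linear regression is assumed, and the variable meanings, the five data points and the function name fit_line are hypothetical stand-ins for coded survey responses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, returning (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical coded responses, e.g. x = reported use of big data tools,
# y = perceived usefulness for assessing healthcare data; not data from the study.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r_squared = fit_line(xs, ys)
print(slope, intercept, r_squared)
```

With the full sample, the sign and size of the slope and the share of variance explained would be read against the stated hypotheses.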
5.2 Research Philosophy
Research philosophy describes the beliefs that scholars or academic researchers adopt in pursuing a research work (Hughes & Sharrock, 2016). These beliefs determine the approach to collecting, analysing and using the data needed to reach conclusions on a research topic, which makes selecting the most appropriate research philosophy crucial. The most prominent research philosophies are positivism and interpretivism, which suit quantitative and qualitative research respectively (Cowling, 2016). Over time, however, several other research philosophies have established themselves and are used to pursue research objectives. One of the most widely used is post-positivism, which has defined the path to completion for this research.
Positivism holds that the subject of a study can be examined from an objective viewpoint without disturbing the subject or its environment (Hasan, 2016). However, several researchers disputed this belief, and post-positivism was born. Post-positivism holds that the subject being researched can be influenced by the researcher's approach, observations, values and other factors. The philosophy therefore pursues objectivity but takes account of the effects the research itself may have. Its practical approach makes it one of the most widely used philosophies; it adopts methods that are measurable, highly organised and based on approaches accepted by the scientific world (Dedeurwaerdere, 2018). Its key features are that it is scientific, objective and robust. Furthermore, the philosophy supports hypothesis testing, the identification of causes, and the collection of facts and quantitative data.
Justification
The selection of post-positivism is supported by the fact that the aim of the paper is to identify the feasibility of big data in the healthcare industry. The subject of the paper is technological and is well supported by a scientific research approach. Furthermore, the objective does not concern human subjects directly and can readily be addressed with quantitative data. The key needs of the research are therefore a scientific approach and quantitative data, both of which are well supported by the post-positivist research philosophy.
5.3 Research Approach
The research approach of a paper details how the researcher reaches the objective of the research work. Deductive and inductive approaches are the two dominant research approaches pursued by researchers and scholars. The deductive approach follows the T-H-O-C sequence, while the inductive approach follows the O-P-H-T sequence (Greenfield et al., 2015). T-H-O-C stands for theory, hypothesis, observation and confirmation. The first step is to discuss the theory of the research subject, which is followed by formulating an assumption or hypothesis. The remaining steps build on the hypothesis: after formulating it, the researcher observes the subject to test the hypothesis, and once it has been validated or rejected, a confirmation is provided to complete the process (Gottfredson & Aguinis, 2017). The deductive approach is suitable for research works where the deduction of ideas, patterns or similar outcomes is expected.
By contrast, the inductive approach follows the O-P-H-T sequence, which stands for observation, pattern, hypothesis and theory (Eisenhardt, Graebner & Sonenshein, 2016). The inductive approach first observes the subject and identifies patterns in order to present hypotheses. After presenting the hypotheses, the researcher presents the theory. Hence, the inductive approach is more suited to the development of new ideas, theories and models that assist with the scenario, object or individual under study.
Justification
The research work under discussion has pursued the deductive research approach. The selection is supported by the compatibility between the research aim and the deductive approach. The objective of the paper is to deduce the feasibility of big data in the healthcare industry, for which the theory was discussed first and the hypotheses were then proposed on that basis. Hence, the selection is justified by the compatibility between the research objective and the research approach.
5.4 Research Design
The research design of a paper refers to the arrangement of the conditions and data collections that are part of the research work. It is one of the most crucial aspects of academic research because it ensures that the evidence obtained enables the researcher to address the problem efficiently and effectively. Several designs are available that could be adopted to address the research questions of the paper. The research design selected for this research work is exploratory. This design is adopted when little or no scholarly work has been done on the subject before. Hence, the design aims to gain insight into the topic of discussion, which makes it suitable for gaining background knowledge on the topic (Ioannidis et al., 2014). Furthermore, the exploratory design offers flexibility and is capable of addressing different types of research questions, such as how, what and why. It also provides the opportunity to define new terms and clarify existing concepts. Research works following this design also offer opportunities and directions for future research.
Justification
The selection of the exploratory design is supported by the fact that it is suitable for research works with a small data sample, which is the case for this research work. Furthermore, the research design is compatible with other features of the research work. One such feature is that the topic has received little or no attention in past research. Additionally, the aim of the paper is to clarify the feasibility of big data in the healthcare industry, which is also well supported by the research design. Hence, the discussion above justifies the selection of the exploratory design for this research work.
5.5 Data Collection methods
Data collection is one of the key processes of the research work because data are its most prominent variable and the findings of the paper depend on them. Data collection for research works is primarily done through two processes, namely primary and secondary data collection (Chidlow et al., 2015). Primary data are data collected by the researchers themselves. They can be further classified into two categories, namely qualitative data and quantitative data. Primary qualitative data are collected through an in-depth analysis or observation of the subject or topic. Interviews, observations and group discussions are the primary means of collecting qualitative data. This data type is suitable for research works that pursue the interpretivist research philosophy. Qualitative data collection uses open-ended questions, which give respondents the opportunity to expand their scope and offer a detailed response (Hurst et al., 2015). However, it is also complicated, because respondents may be confused by a question and give answers that are ineffective, vague or of no use for the research purpose. Furthermore, the sample size in qualitative data collection is very small, which is inadequate for a broad topic such as big data in the healthcare industry.
The second type of primary data collection is primary quantitative data collection, which aims for a broader sample size than its qualitative counterpart. This method relies on closed-ended questions, which ensure that the findings from the collected data remain relevant to the research work and also make responding easier (Suhonen et al., 2015). The data are collected by means of surveys, polls and similar activities. Closed-ended questions and a limited number of questions attract more respondents. Additionally, a broader sample is vital and makes the outcome more reliable. Furthermore, quantitative data collection is simple, efficient and effective for researchers. However, its sample size is small compared with that of secondary data collection.
Finally, secondary data are data that have been collected earlier by some other entity (an individual, body, organisation or other). They are collected from newspapers, scholarly works, gazetted publications and other reliable secondary sources (Palinkas et al., 2015). Secondary data offer the largest sample size of all the data collection processes. However, certain issues are associated with this method. One issue concerns access: most secondary data are available online only in paid versions, which makes them difficult for academic researchers to collect. The most prominent issue, however, is that technology is a disruptive development in continuous flux, which may render secondary data invalid (Wilson, McCarthy & Dau, 2016). For example, security concerns about data transmission that were recorded in secondary data before the introduction of cryptography can no longer be trusted, because the technology and its security measures have since changed.
Selected data collection method and process
Hence, based on the research objective and the discussion of data collection procedures, the paper has pursued primary quantitative data collection and has collected the data through an online survey. As part of the survey, a questionnaire was distributed among big data and healthcare professionals in Jacksonville. Their responses were recorded and form the data on which the findings of the paper are based.
Justification
The discussion above has covered the data collection procedures along with their enablers and constraints. It has been identified that qualitative data could not be relied on for the outcome because of the very small sample size and doubts over the responses, while issues of data unreliability and access ruled out secondary data collection. Quantitative data, however, offer a reasonable sample size, which makes the findings reliable, and are simple to collect. Furthermore, the respondents to the questionnaire are big data and healthcare professionals, whose responses are authentic and serve as an observed experiment. Hence, the selection of the quantitative data collection method is justified.
5.6 Data Sampling and Analysis
Population
The population from which the respondents are drawn belongs to the country whose healthcare data is being assessed. The entire population of the country cannot be taken into consideration, and therefore a single city has been selected (Costa, 2014). Jacksonville is the city selected for the collection of data. From this city, the respondents are selected based on their association with big data, healthcare or both.
Sample
The sample has been selected from the city of Jacksonville, and a sample size of 50 respondents has been chosen, to whom the questionnaires were forwarded so that the primary data could be collected. Sampling is based on the probability sampling technique. This technique supports the belief that responses from individuals who belong to the same geographical area and are discussing a common topic can be assessed on common ground (Bonawitz et al., 2014). Furthermore, the questions used for data collection are closed-ended, so the responses are clear but limited, which also supports the probability sampling technique and hence its selection.
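As an illustration of how a sample of this size could be drawn, the sketch below uses simple random sampling, assuming that this is the form of probability sampling intended. The sampling frame of 200 professionals, the identifiers and the function name draw_sample are hypothetical and are not taken from the study.

```python
import random


def draw_sample(frame, size, seed=None):
    """Draw a simple random sample without replacement from a sampling frame."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, size)


# Hypothetical sampling frame: 200 big data / healthcare professionals.
frame = [f"professional_{i:03d}" for i in range(1, 201)]

# Every member of the frame has the same chance of being selected.
sample = draw_sample(frame, 50, seed=42)
print(len(sample), "respondents selected")
```

Each member of the frame is equally likely to be chosen, which is what allows the responses to be treated on common ground.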
Data Analysis
Multiple tools and techniques are available for analysing the collected research data. The quantitative data analysis determines the relationship between the scores of the various responses and the respondents' perception of the effect of big data on the healthcare sector, which is expressed through regression analysis. Regression analysis is one of the most prominent analysis techniques and estimates the relationships between different variables that are relevant to the topic of discussion (Chatterjee & Hadi, 2015). It has been selected for the analysis of the data because of this prominence. The regression model of the hypothesis can be stated as follows:
y = A + B * x
Where,
y = dependent variable of the hypothesis
x = independent variable of the hypothesis
B = regression coefficient (slope) of x
A = intercept constant
The analysis has been done using the MS Excel statistical tool.
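The model above can be estimated by ordinary least squares. The following Python sketch only illustrates the calculation (the study itself used MS Excel); the scores are hypothetical and the function name fit_simple_regression is an assumption introduced here.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Return (A, B) for the least-squares line y = A + B * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope B: covariance of x and y divided by the variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # Intercept A: the fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical Likert-style scores, not the survey data.
x_scores = [1, 2, 3, 4, 5]
y_scores = [1.5, 2.0, 3.5, 4.0, 4.5]

A, B = fit_simple_regression(x_scores, y_scores)
print(f"y = {A:.2f} + {B:.2f} * x")
```

With several independent variables, as in the coefficient table reported later, the same idea extends to multiple regression, which Excel's regression tool computes directly.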
5.7 Ethical Consideration
Research works are carried out to mitigate the threats faced by the subject or to enhance its capability by identifying its current status and potential future. Hence, it is important that adequate ethical consideration is given in order to gain sustainable results from the work (Minaya, 2016). In collecting the data, this paper has therefore given adequate ethical consideration, which includes:
i. Informing the respondents about the research aim, objective and scope.
ii. Consensual participation in the survey.
iii. Maintaining the privacy of the respondents.
iv. Avoiding data manipulation and maintaining an unbiased attitude during the research activities.
v. Citing the sources from which the data are drawn.
Other ethical considerations, such as avoiding data theft and respecting copyright obligations, have also been observed to maintain the ethicality of the research work.
5.8 Limitations
The sample size of the collected data is 50, drawn from a specific domain; hence, the findings of the paper are not applicable to the whole world but are limited to the region of research. Furthermore, the research work has been limited by the small budget and limited reach of the researcher. However, these limitations also offer opportunities for future research.
Chapter 7: Outputs and contributions
7.1 Outputs of the study based on data collection
In the present study, employees of the healthcare industry were surveyed in order to perform a quantitative analysis of healthcare data using big data. A total of 50 respondents were willing to fill in the online survey form and therefore make up the sample of the research study.
Section 1: Demographics Information
1. Please indicate the age group
Options                 No of respondents   Total respondents   Response percentage
18-24 years             2                   50                  4%
25-34 years             28                  50                  56%
35-44 years             12                  50                  24%
45-54 years             6                   50                  12%
55 years and above      2                   50                  4%
Table 7.1: Age group of the respondents

Figure 7.1: Age group of the respondents

Findings:
From the above figure, it can be seen that most of the respondents who participated in the survey are aged 25-34 years (56%). 24% of the participants are aged 35-44 years, 12% are aged 45-54 years, 4% are aged 18-24 years and 4% are aged 55 years and above. Therefore, most of the workers who were interested in the survey fall in the middle age groups.
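The response percentages in the demographic tables follow directly from the respondent counts. A minimal sketch of the calculation is given below, using the age-group counts from Table 7.1; the function name response_percentages is an assumption introduced for illustration.

```python
# Respondent counts per age group, as reported in Table 7.1.
age_counts = {
    "18-24 years": 2,
    "25-34 years": 28,
    "35-44 years": 12,
    "45-54 years": 6,
    "55 years and above": 2,
}


def response_percentages(counts):
    """Convert respondent counts into percentages of the total sample."""
    total = sum(counts.values())
    return {option: 100 * n / total for option, n in counts.items()}


percentages = response_percentages(age_counts)
for option, pct in percentages.items():
    print(f"{option}: {pct:.0f}%")
```

The same function can be applied to the counts of any other survey question to check that the reported percentages add up to 100%.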
2. Please select your gender
Options                 No of respondents   Total respondents   Response percentage
Male                    36                  50                  72%
Female                  14                  50                  28%

Table 7.2: Gender of the respondents

Figure 7.2: Gender of the respondents
Findings:
From the above figure, it can be seen that 72% of the respondents are male and 28% are female. Therefore, both male and female respondents were interested in providing feedback on the analysis of healthcare data using big data.
3. Please select your designation
Options                 No of respondents   Total respondents   Response percentage
Project manager         7                   50                  14%
Employee                16                  50                  32%
IT Manager              18                  50                  36%
System Analyst          9                   50                  18%

Table 7.3: Designation of the respondents

Figure 7.3: Designation of the respondents
Findings:
From the above figure, it can be seen that 32% of the survey respondents are regular employees who are not at senior management level. 36% of the respondents are IT managers, 14% are project managers and 18% are system analysts. Therefore, IT managers were the group most interested in providing feedback through the survey.
4. Do you process streamlining big data in your industry?
Options                 No of respondents   Total respondents   Response percentage
Yes                     48                  50                  96%
No                      2                   50                  4%

Table 7.4: Process streamlining big data in the industry

Figure 7.4: Process streamlining big data in the industry

Findings:
From the above figure, it can be seen that 96% of the respondents agreed that their healthcare organization streamlines its processes using big data, whereas 4% did not agree. Therefore, most healthcare organizations streamline their processes using big data.
Section 2: Analysis of Healthcare Data using Big Data
1. How far do you agree that derivation of policy actions of big data is useful for healthcare data assessment?

Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   24                  50                  48%
Neutral                 23                  50                  46%
Disagree                2                   50                  4%
Strongly disagree       0                   50                  0%
Table 7.5: Derivation of policy actions of big data is useful for healthcare data assessment

Figure 7.5: Derivation of policy actions of big data is useful for healthcare data assessment
Findings:
From the above figure, it can be seen that 48% of the respondents agreed that the derivation of policy actions from big data is useful for healthcare data assessment. On the other hand, 46% neither agreed nor disagreed. Potential policies for implementing big data tools are used to obtain better healthcare from healthcare providers. Big data tools extract useful information and derive actionable knowledge, which helps to assess and quantify the errors in healthcare data. Predictive analytics is required as a new tool for healthcare technologists to capture and process the data in order to complete the transformation of the healthcare industry.
2. How far do you agree that clinic workflows assure confidentiality as it is related to patient communication and billing practices?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          2                   50                  4%
Agree                   31                  50                  62%
Neutral                 13                  50                  26%
Disagree                4                   50                  8%
Strongly disagree       0                   50                  0%

Table 7.6: Clinic workflows assure confidentiality


Figure 7.6: Clinic workflows assure confidentiality
Findings:
From the above figure, it can be seen that 62% of the respondents agreed that clinic workflows assure confidentiality as it relates to patient communication and billing practices. 26% neither agreed nor disagreed, 8% disagreed and 4% strongly agreed. With upgraded workflows, clinical practices help to produce good communication. Reducing the patient cycle time and improving patient satisfaction improve clinical practice and ensure that patients receive the services they require in their clinical areas.
3. How far do you agree that satisfaction of patients is a cost effective way to evaluate hospital services?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   34                  50                  68%
Neutral                 11                  50                  22%
Disagree                3                   50                  6%
Strongly disagree       1                   50                  2%

Table 7.7: Satisfaction of patients is a cost effective way to evaluate hospital services

Figure 7.7: Satisfaction of patients is a cost effective way to evaluate hospital services
Findings:
From the above figure, it can be seen that 68% of the respondents agreed that the satisfaction of patients is a cost-effective way to evaluate hospital services. 22% neither agreed nor disagreed and 2% strongly agreed. The healthcare system is judged on the availability, efficiency and cost of the services provided to patients. The study shows that patients admitted to different hospital wards should be satisfied with the quality of the professional services and the availability of clinical amenities. The study also shows that assessing patient satisfaction is a simple, easy and cost-effective way to evaluate hospital services.
4. How far do you agree that big data analytics in healthcare market is gaining interest because of introduction of personalized healthcare systems and higher quality of healthcare services?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   24                  50                  48%
Neutral                 21                  50                  42%
Disagree                4                   50                  8%
Strongly disagree       0                   50                  0%

Table 7.8: Big data analytics in healthcare market is gaining interest

Figure 7.8: Big data analytics in healthcare market is gaining interest
Findings:
From the above figure, it can be seen that 48% of the respondents agreed that big data analytics in the healthcare market is gaining interest because of the introduction of personalized healthcare systems and higher quality of healthcare services. 42% neither agreed nor disagreed. Big data analytics is gaining interest because of the introduction of personalized healthcare systems and the demand for higher-quality healthcare services. Through the adoption of big data, healthcare payers and providers enhance their capabilities by studying the behaviour of patients towards specific treatments and providing them with customized and cost-effective services.
5. How far do you agree that decreasing cost, availability of big data software and adoption of data analytics are factors driving growth of big data analytics?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   40                  50                  80%
Neutral                 6                   50                  12%
Disagree                3                   50                  6%
Strongly disagree       0                   50                  0%

Table 7.9: Driving growth of big data analytics

Figure 7.9: Driving growth of big data analytics
Findings:
From the above figure, it can be seen that 80% of the respondents agreed that decreasing cost, the availability of big data software and the adoption of data analytics are factors driving the growth of big data analytics. 12% neither agreed nor disagreed. The factors that drive the growth of big data analytics in the healthcare market include increasing demand for financial analytics in healthcare, unstructured data, decreasing cost, the availability of big data software and the adoption of new data analytics technologies in healthcare business transformation.
6. How far do you agree that an electronic health record is best application of big data in healthcare?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          0                   50                  0%
Agree                   30                  50                  60%
Neutral                 12                  50                  24%
Disagree                8                   50                  16%
Strongly disagree       0                   50                  0%

Table 7.10: Electronic health record is best application of big data in healthcare

Figure 7.10: Electronic health record is best application of big data in healthcare
Findings:
From the above figure, it can be seen that 60% of the respondents agreed and 24% neither agreed nor disagreed that the electronic health record (EHR) is the best application of big data in healthcare. The introduction of the electronic health record with cloud computing has given a new direction to medical models. Business intelligence brings an increase in profit. With the use of the cloud, the industry becomes capable of uploading more information, while big data insights are drawn from the data. The electronic health record is the best application in healthcare. Patients have their own medical records, such as laboratory tests, medical reports and lists of medicines. An EHR makes data easier to maintain and gives access to medical and healthcare data. It is used to predict the patient’s income in order to tailor staffing. It helps in making better medical and financial decisions.
7. How far do you agree that machine learning is most accurate algorithms to predict future healthcare trends?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          0                   50                  0%
Agree                   38                  50                  76%
Neutral                 6                   50                  12%
Disagree                6                   50                  12%
Strongly disagree       0                   50                  0%
Table 7.11: Machine learning is most accurate algorithms to predict future healthcare trends

Figure 7.11: Machine learning is most accurate algorithms to predict future healthcare trends
Findings:
From the above figure, it can be seen that 76% of the respondents (38 of 50) agreed that machine learning provides the most accurate algorithms to predict future healthcare trends. Machine learning is used in healthcare analytics platforms. Such algorithms can predict 30-day mortality and give more accurate predictions than existing methods. The algorithms allow healthcare providers to replace some clinical tasks. They make the clinical process better and more accurate and handle care practices with increasing efficiency, so that physicians are free to focus on complex issues. Machine learning is used to handle complex healthcare data effectively in order to deal with big data in the healthcare industry. It will make the data storage expensive as well as available to the users.
8. How far do you agree that utilization of big data in the telemedicine, healthcare and public health practices are helpful for healthcare industry?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          2                   50                  4%
Agree                   33                  50                  66%
Neutral                 14                  50                  28%
Disagree                1                   50                  2%
Strongly disagree       0                   50                  0%

Table 7.12: Utilization of big data in the telemedicine, healthcare and public health practices are helpful for healthcare industry

Figure 7.12: Utilization of big data in the telemedicine, healthcare and public health practices are helpful for healthcare industry
Findings:
From the above figure, it can be seen that 66% of the respondents agreed that the utilization of big data in telemedicine, healthcare and public health practices is helpful for the healthcare industry. 28% of the respondents neither agreed nor disagreed. The use of big data analytics in telemedicine, healthcare and public health practices has evolved to provide insight from larger data sets and to improve project outcomes while reducing operational cost. Analytics platforms and healthcare solutions offer greater insight into how healthcare providers manage patient care and cost. The increasing digitalization of the healthcare sector means that organizations are adding terabytes of patient data to their data centres.
9. How far do you agree that clinical decision support system help to analyze medical data and provide health practitioners?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   33                  50                  66%
Neutral                 12                  50                  24%
Disagree                4                   50                  8%
Strongly disagree       0                   50                  0%

Table 7.13: Clinical decision support system help to analyze medical data and provide health practitioners

Figure 7.13: Clinical decision support system help to analyze medical data and provide health practitioners
Findings:
From the above figure, it can be seen that 66% of the respondents agreed with the statement that a clinical decision support system helps to analyze medical data and provide health practitioners. The clinical decision support system analyzes data to help providers make decisions and improve patient care. It also helps to obtain clinical advice based on patient-related data. The system integrates workflows, provides assistance at the time of care and offers plan recommendations. There is an improvement in the quality of products where the data can help patients to take an active role in their health, for example through diet and medication. The data can improve medical outcomes, reduce medical errors and facilitate evidence-based care personalized to the patient.
10. How far do you agree that quality reporting tools can reduce hospital acquired conditions and patient safety events?
Options                 No of respondents   Total respondents   Response percentage
Strongly Agree          1                   50                  2%
Agree                   32                  50                  64%
Neutral                 12                  50                  24%
Disagree                5                   50                  10%
Strongly disagree       0                   50                  0%

Table 7.14: Quality reporting tools can reduce hospital acquired conditions and patient safety events

Figure 7.14: Quality reporting tools can reduce hospital acquired conditions and patient safety events
Findings:
From the above figure, it can be seen that 64% of the respondents agreed that quality reporting tools can reduce hospital-acquired conditions and patient safety events. 24% neither agreed nor disagreed. In the healthcare sector, quality measurement tools help with patient safety events such as adverse drug events, injuries from falls and others. Hospital-acquired conditions are costly to patient safety. Big data tools help to reduce safety events and to improve the quality scores of hospitals.
The relationship between the dependent and independent factors associated with each hypothesis is set out as follows:
Hypothesis 1: Big Data is useful for the assessment of the healthcare data.
The regression model of the hypothesis can be stated as follows:
Dependent variable of hypothesis = A + B * independent variable of hypothesis
Where, A = Intercept of the model
And B = Coefficient of the “relationship between big data and healthcare practices”.
The details of the regression analysis conducted on the above-mentioned model are presented in the tables below:
Table 7.15: Summary output of regression analysis conducted for hypothesis 1
Regression Statistics
Multiple R            0.8205
R Square              0.6732
Adjusted R Square     0.6519
Standard Error        0.1908
Observations          50

Table 7.16: ANOVA table of regression analysis conducted for hypothesis 1
ANOVA
Source      df    SS       MS       F         Significance F
Regression  3     3.450    1.150    31.590    0.000
Residual    46    1.675    0.036
Total       49    5.125
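The summary statistics reported for hypothesis 1 can be cross-checked from the ANOVA sums of squares using the standard regression formulas. The short Python sketch below is offered only as a verification aid (the study itself used MS Excel); the variable names are assumptions, and the input values are those reported in the tables for hypothesis 1.

```python
import math

# Values reported in the ANOVA table for hypothesis 1.
ss_regression, ss_total = 3.450, 5.125
df_regression, df_residual = 3, 46
n_observations = 50

ss_residual = ss_total - ss_regression

# R Square: share of the total variation explained by the regression.
r_squared = ss_regression / ss_total
# Adjusted R Square: penalises R Square for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / df_residual

# Mean squares and the F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# Standard error of the regression (root of the residual mean square).
standard_error = math.sqrt(ms_residual)

print(f"R Square          = {r_squared:.4f}")
print(f"Adjusted R Square = {adjusted_r_squared:.4f}")
print(f"F                 = {f_statistic:.2f}")
print(f"Standard Error    = {standard_error:.4f}")
```

The recomputed values agree with the reported R Square, Adjusted R Square, F and Standard Error up to the rounding of the published sums of squares.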

Table 7.17: Coefficient table of regression analysis conducted for hypothesis 1
Variable                              Coefficients   Standard Error   t Stat   P-value
Intercept                             0.245          0.199            1.228    0.226
Deviation of policies in healthcare   0.171          0.059            2.894    0.006
Big data analytics                    0.397          0.078            5.082    0.000
Performance of big data application   0.271          0.063            4.297    0.000
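Each t Stat in the coefficient table is the coefficient divided by its standard error. The following sketch recomputes the t statistics and the sum of the three slope coefficients from the reported values; the dictionary layout and variable names are assumptions made for illustration, and small differences from the table are expected because the published inputs are rounded to three decimals.

```python
# (coefficient, standard error) pairs as reported for hypothesis 1.
coefficients = {
    "Intercept": (0.245, 0.199),
    "Deviation of policies in healthcare": (0.171, 0.059),
    "Big data analytics": (0.397, 0.078),
    "Performance of big data application": (0.271, 0.063),
}

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# Sum of the slope coefficients (everything except the intercept).
slope_sum = sum(
    coef for name, (coef, _) in coefficients.items() if name != "Intercept"
)

for name, t in t_stats.items():
    print(f"{name}: t = {t:.3f}")
print(f"Sum of slope coefficients = {slope_sum:.3f}")
```

The sum of the three slope coefficients corresponds to the value B = 0.839 quoted for the model.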

The null and alternative hypotheses associated with hypothesis 1 are as follows:
Null hypothesis (H0): Big Data is not useful for the assessment of the healthcare data
Alternative hypothesis (H1): Big Data is useful for the assessment of the healthcare data
Substituting the intercept and coefficient values from Table 7.17 into the regression model (Dependent variable of hypothesis = A + B * independent variable of hypothesis) gives:
A = 0.245 = Intercept of the model
B = 0.839 = Coefficient of the “relationship between big data and healthcare practices”, which is the sum of the three coefficients in Table 7.17 (0.171 + 0.397 + 0.271)
An analysis of the regression findings indicates that the value of Significance F for this hypothesis is 0.000. When the value of Significance F obtained through regression analysis is less than 0.05, the null hypothesis is rejected. Therefore, the statement “Big Data is useful for the assessment of the healthcare data” is supported. In this study, there are three independent variables (deviation of policies in healthcare, big data analytics and performance of big data application) and one dependent variable (big data). The adjusted R-square (0.6519) indicates the proportion of the variation in the dependent variable that is explained by the model, adjusted for the number of independent variables. The ANOVA test was used to determine whether to reject the null hypothesis. Because the p-value is less than 0.05, the researcher rejects the null hypothesis and accepts the alternative hypothesis.
7.2 Impact of contributions
The research study contributes in various ways by discussing the impact of big data on healthcare data. From the discussion of the results and outputs of the collected data, it is seen that big data analytics is used on medical data. Cloud-based big data platforms store and analyse data from the available data sources. As the technology continues to grow, it has an impact on medical operations, and digital platforms such as big data help doctors to stay on top of patient information.
7.3 Limitations of the study
The limitation of this study is that it is based only on the primary data collection method, owing to time constraints. The study was conducted to identify the factors related to healthcare data using big data. Two constraints on the research work, time and budget, limited the extent to which the researcher could evaluate information from secondary data sources such as newspapers, articles and journals. Based on the literature review and secondary sources, the researcher decided to exclude SPSS analysis from the study.