At the first Spark Summit Europe, a presentation by Intel researchers working with the Michael J. Fox Foundation proposed that Big Data could be the key to unlocking the mysteries of Parkinson’s disease. Spark Summit Europe is the premier event for the Apache Spark community. Apache Spark is a fast, general-purpose open source engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
Summit attendees convened from Tuesday, Oct. 27, through Thursday, Oct. 29, at the Beurs van Berlage in Amsterdam, the magnum opus of celebrated Dutch architect Berlage. They heard from leading production users of Spark, Spark SQL, Spark Streaming and related projects, discovered where project development is going, and learned how to use the Spark stack in a variety of applications.
Ido Karavany, a Big Data analytics architect and development manager with Intel’s Advanced Analytics group, addressed the Summit on leading-edge technology projects within Intel involving Big Data and stream analytics solutions for the Internet of Things (IoT) and for Parkinson’s disease (PD) research. Mr. Karavany has more than eight years of experience in software development in the domains of data analytics and distributed computing solutions.
In his Oct. 28 presentation entitled “Using Spark in an IoT Analytics Platform Enable breakthroughs in Parkinson Disease Research,” Mr. Karavany presented a partnership-developed approach that may enable breakthroughs in Parkinson’s disease research by leveraging wearable sensors, smartphones and big data analytics to monitor PD patients’ motor movements 24/7.
He explained that the research team has built an IoT Big Analytics platform (on Amazon Cloud Drive) based on open source technologies, such as Cloudera Distribution for Hadoop, to enable collection and processing of high-volume data streams (up to 1 GB per patient per day). Mr. Karavany noted that the platform has been used successfully in multiple clinical trials and that the project has started ramping up to connect thousands of patients 24/7 by the end of 2015.
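To put the quoted figure in perspective, here is a back-of-the-envelope sketch of what "up to 1 GB per patient per day" implies at scale. The cohort sizes are hypothetical (the presentation said only "thousands of patients"), the function names are illustrative, and the result is an upper bound, since 1 GB is the stated maximum per patient.

```python
def daily_ingest_gb(patients, gb_per_patient_per_day=1.0):
    """Upper-bound daily ingest in GB for a cohort of the given size."""
    return patients * gb_per_patient_per_day


def yearly_ingest_tb(patients, gb_per_patient_per_day=1.0, days=365):
    """Upper-bound yearly ingest in TB, using decimal units (1 TB = 1000 GB)."""
    return daily_ingest_gb(patients, gb_per_patient_per_day) * days / 1000.0


# Hypothetical cohort sizes, chosen only for illustration.
for cohort in (1_000, 5_000):
    print(cohort, "patients:",
          daily_ingest_gb(cohort), "GB/day,",
          yearly_ingest_tb(cohort), "TB/year")
```

Under these assumptions, even a cohort of 1,000 patients could generate hundreds of terabytes a year, which helps explain the choice of a distributed storage and processing stack.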
The platform uses HBase and HDFS as its main scalable storage layer. The analytics batch layer leverages Apache Spark (over HBase and HDFS) and includes a set of complex machine learning algorithms, a sophisticated event-based rule engine, an automatic change detection engine and a variety of PD-related measurements.
Examples of these measurements include activity recognition, patients’ sleep quality, tremor detection and PD gait recognition. Mr. Karavany’s presentation included an explanation of how the researchers use Spark to implement their machine learning algorithms.
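As a rough illustration of what a measurement such as tremor detection involves, the sketch below checks whether the dominant frequency of an accelerometer trace falls in the 4 to 6 Hz band typical of Parkinsonian rest tremor. This is not the Intel team's algorithm, which the article does not describe; the function names, the band limits as a parameter, and the synthetic signals are all assumptions made for illustration. A real detector would also need to account for amplitude, noise and the patient's current activity.

```python
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC component.

    Uses a naive O(n^2) discrete Fourier transform, which is fine for a
    short illustrative window. Assumes a non-empty list of samples.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_freq, best_power = 0.0, 0.0
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * i / n)
                 for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = k * sample_rate_hz / n, power
    return best_freq


def looks_like_tremor(samples, sample_rate_hz, band=(4.0, 6.0)):
    """True if the dominant frequency lies inside the assumed tremor band."""
    freq = dominant_frequency(samples, sample_rate_hz)
    return band[0] <= freq <= band[1]


# Synthetic two-second windows sampled at 50 Hz (hypothetical data).
rate = 50
tremor_like = [math.sin(2 * math.pi * 5 * i / rate) for i in range(100)]
slow_motion = [math.sin(2 * math.pi * 1 * i / rate) for i in range(100)]
print(looks_like_tremor(tremor_like, rate), looks_like_tremor(slow_motion, rate))
```

In a platform like the one described, a per-window check of this kind would presumably run over each patient's stream in the batch layer, but the article does not say how the team structured that computation.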
“We’ll focus on our challenges using Spark, starting with data extracting from HBase challenges and solutions for our batch and near-real time calculations, [and] we’ll also review our solution evolution and will show what worked and didn’t work for us (i.e. many small jobs vs. fewer consolidated larger jobs, multiple vs. single Spark contexts),” Mr. Karavany said.
The slideshow from Mr. Karavany’s presentation is available to view and download online, and a video of the presentation can be viewed on YouTube.
Parkinson’s disease (PD) is a chronic, progressive, degenerative neurological disorder of poorly understood cause, in which the nerve cells that produce the natural brain chemical dopamine are damaged and unable to produce enough of the biochemical agent. The resulting diminished dopamine levels cause a variety of problems associated with movement, including tremors (shaking), stiffness, and slowness of movement. There is currently no known cure for Parkinson’s.
“Nearly 200 years after Parkinson’s disease was first described by Dr. James Parkinson in 1817, we are still subjectively measuring Parkinson’s disease largely the same way doctors did then,” Michael J. Fox Foundation CEO Todd Sherer, PhD, told Special Guest Correspondent Chrissie Cluney, reporting for IoT Evolution News. “Data science and wearable computing hold the potential to transform our ability to capture and objectively measure patients’ actual experience of disease, with unprecedented implications for Parkinson’s drug development, diagnosis and treatment.”
“The variability in Parkinson’s symptoms creates unique challenges in monitoring progression of the disease,” Diane M. Bryant, Senior Vice President and General Manager of the Intel Data Center Group, told Ms. Cluney. “Emerging technologies can not only create a new paradigm for measurement of Parkinson’s, but as more data is made available to the medical community, it may also point to currently unidentified features of the disease that could lead to new areas of research.”
Ms. Bryant leads the worldwide organization that develops the data center platforms for the digital services economy, a business that generated more than $14 billion in revenue in 2014. In her current role, she is building the foundation for continued growth by driving new products and technologies, from high-end co-processors for supercomputers to high-density systems for the cloud to solutions for big data analytics.
Spark Summit Europe
Michael J. Fox Foundation