

Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training

In this paper, we present a novel parallel implementation for training Gradient Boosting Decision Trees (GBDTs) on Graphics Processing Units (GPUs). Thanks to their excellent results on classification and regression and to open-source libraries such as XGBoost, GBDTs have become very popular in recent years and have won many awards in machine learning and data mining competitions. Although GPUs have demonstrated their success in accelerating many machine learning applications, it is challenging to develop an efficient GPU-based GBDT algorithm. The key challenges include irregular memory accesses, many sorting operations with small inputs, and varying data-parallel granularities in tree construction. To tackle these challenges on GPUs, we propose various novel techniques including (i) Run-length Encoding compression and dynamic allocation of thread/block workloads, (ii) data partitioning based on stable sort, and fast and memory-efficient attribute ID lookup in node splitting, (iii) finding approximate split points using two-stage histogram building, (iv) building histograms with awareness of sparsity and exploiting histogram subtraction to reduce the histogram building workload, (v) reusing intermediate training results for efficient gradient computation, and (vi) exploiting multiple GPUs to handle larger data sets efficiently. Our experimental results show that our algorithm, named ThunderGBM, can be 10 times faster than the state-of-the-art libraries (i.e., XGBoost, LightGBM and CatBoost) running on a relatively high-end workstation with 20 CPU cores. In comparison with the libraries on GPUs, ThunderGBM can handle higher-dimensional problems on which those libraries become extremely slow or simply fail. For the data sets that the existing libraries on GPUs can handle, ThunderGBM achieves up to a 10-times speedup on the same hardware, which demonstrates the significance of our GPU optimizations. Moreover, the models trained by ThunderGBM are identical to those trained by XGBoost, and are of similar quality to those trained by LightGBM and CatBoost.
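
The histogram subtraction mentioned in technique (iv) can be illustrated with a small CPU-side sketch. This is not ThunderGBM's GPU code: the function names, the single quantized feature and the toy gradient values are all assumptions made for illustration. The idea shown is that a parent node's gradient histogram is the sum of its two children's histograms, so once one child's histogram has been built, the other can be obtained by subtraction instead of a second pass over the data.

```python
def build_histogram(bin_ids, gradients, num_bins):
    """Accumulate the gradient sum of every feature bin for one tree node."""
    hist = [0.0] * num_bins
    for bin_id, grad in zip(bin_ids, gradients):
        hist[bin_id] += grad
    return hist


def subtract_histogram(parent_hist, child_hist):
    """Derive the sibling's histogram as parent minus the known child."""
    return [p - c for p, c in zip(parent_hist, child_hist)]


# Toy node (hypothetical values): 8 instances, one feature quantized into 4 bins.
NUM_BINS = 4
bin_ids = [0, 1, 1, 2, 3, 3, 0, 2]
gradients = [0.5, -1.0, 0.25, 2.0, -0.5, 1.5, -0.75, 1.0]
goes_left = [True, False, True, True, False, False, True, False]

parent = build_histogram(bin_ids, gradients, NUM_BINS)

# Build the histogram of the left child by scanning only its instances.
left_rows = [i for i, is_left in enumerate(goes_left) if is_left]
left = build_histogram([bin_ids[i] for i in left_rows],
                       [gradients[i] for i in left_rows], NUM_BINS)

# The right child is never scanned; its histogram comes from subtraction.
right = subtract_histogram(parent, left)

# Reference result computed the expensive way, for comparison only.
right_rows = [i for i, is_left in enumerate(goes_left) if not is_left]
right_direct = build_histogram([bin_ids[i] for i in right_rows],
                               [gradients[i] for i in right_rows], NUM_BINS)

print(right == right_direct)
```

The toy gradients are exactly representable in binary floating point, so the two results compare equal; with arbitrary real-valued gradients the comparison would need a tolerance. A common design choice, assumed here rather than taken from the abstract, is to scan the smaller child and subtract to obtain the larger one.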

How investment analysts became data miners

Banks battle for audiences with new information sets, ‘charticles’ and podcasts

Professions: Analyst II - Pricing Analytics - Fort Worth, Texas

Overview

The Analyst II-Pricing is responsible for analysis, modeling, and reporting on factors that impact product pricing, profitability, and cash flow projections. This position will use complex analysis, forecasting, and modeling concepts to mitigate risks and identify opportunities relative to pricing and profitability of originations. This team member is responsible for summarizing and reporting this information to a variety of internal clients. This position will interact with many other departments in the interest of achieving the overall company objectives.

Responsibilities

Job duties

  • Develop and support complex models, analysis, and reporting related to product pricing, profitability and cash flow projections
  • Conduct ad hoc research projects incorporating project design, data collection and analysis, summarization of findings and presentation of results
  • Gather and analyze data to determine impact to business operations
  • Develop and present recommendations to management in a clear, concise, convincing, and actionable format
  • Maintain and apply knowledge of available internal and external data sources
  • Work with other departments in the interest of achieving assigned objectives
  • Lead projects or special assignments as required
  • Maintain processes for tracking projects; communicate results to management
  • Support ad hoc projects or special assignments as required
  • Perform other duties as assigned
  • Conform with all corporate policies and procedures

Qualifications

Knowledge

  • Advanced knowledge of finance, statistics, accounting and/or financial services
  • Advanced knowledge of financial concepts such as ROA, ROE, IRR and cash flow modeling
  • Knowledge of the auto finance industry
  • Knowledge of GM Financial's core business functions
  • Proficient with data mining tools such as Access, SAS and SQL
  • Strong knowledge of PC software applications such as Excel, Word and PowerPoint
  • Strong understanding of products and related work processes in area of responsibility
  • Strong understanding of the sales and credit divisions
  • Strong working knowledge of dealer and reporting systems

Skills

  • Ability to ensure data integrity
  • Ability to manage multiple tasks and deadlines
  • Ability to work in SAS, SQL and Access
  • Ability to work with minimal supervision
  • Adaptive to a team environment
  • Detail oriented and able to manage multiple tasks/deadlines
  • Effective written and verbal presentation skills
  • Excellent organizational and time management skills
  • Good analytical skills with the ability to analyze and calculate reports based upon researched data
  • Good quantitative skills
  • Good skills in Excel features such as Pivot tables, VLOOKUPs and creating customized functions

Education

  • High School Diploma or equivalent required
  • Associate Degree preferred (in lieu of degree, relevant experience will be considered)
  • Bachelor's Degree in Management, Finance, Mathematics, Statistics, Economics or a related field, or equivalent work experience, preferred
  • Master's Degree in Management, Finance, Mathematics, Statistics, Economics or a related field, or equivalent work experience, preferred

Experience

  • 2-3 years of financial analysis, data mining, statistics and forecasting required
  • 2-4 years of work experience required

Working Conditions

  • Occasional overtime or split shifts may be required
  • Normal office environment
  • Fast-paced office environment
  • Strong focus on providing quality service to internal and external customers
  • Must be able to deal with stressful office conditions while troubleshooting problems

Executive: Consumer Insights Manager - Boca Raton, Florida

The Customer Experience Manager is responsible for creating Customer knowledge by providing relevant and actionable process, protocol and systems design recommendations that will shape general Consumer experience and influence the Service Cycle and marketing activities as well as the overall customer strategy. Responsible for analyzing the trends collected from customer escalations, BBB complaints, Media and survey sources to convert the data into insights and actions. Using Marketing expertise, the insights will inform marketing initiatives, process, product and technology recommendations. This will be accomplished through extracting data from existing data sources (transactional and customer databases) as well as summarizing the findings for business partners. Will support and interact with the Operations, Product and Marketing teams. The CX Manager will develop a broad and deep understanding of the customer behaviors and create a customer journey map to take the customer experience to the highest levels.

RESPONSIBILITIES:

  • Define survey tools, analyze transactional and customer data, perform cross-tabulations and statistical analysis, provide insights into consumer relationship patterns, and develop hypotheses about trends.
  • Responsible for both recurring/standardized reporting and ad hoc projects.
  • Perform independent data mining and database research to identify customer opportunities.
  • Define, design and map a detailed customer journey roadmap and propose systems, service and marketing initiatives.

QUALIFICATIONS:

  • Bachelor's degree in Marketing.
  • Experience with NPS or other VOC analytics preferred.
  • Demonstrated analytic skills (ROI, profitability, customer segmentation, etc.).
  • Database and data analysis experience required: demonstrated use of relational databases and programming.
  • Familiarity with NICE tools is a plus.
  • Proficient with MS Excel, MS PowerPoint.
  • Must be able to take verbal business requirements and convert them into results.
  • Must be able to quickly translate data into concise and insightful executive summaries.
  • Must be an effective project manager. This position requires an individual who is able to manage multiple projects simultaneously.
  • Must be able to work independently in a fast-paced, high-energy environment. The ability to multitask will be a key to success.
  • Strong written and verbal skills are a must.
  • Experience delivering C-Suite presentations is required.

2019-10451 - Intern – Operational Performance Data Analyst, Air France Cargo (M/F)

Job family: Strategy and Development / Studies and Performance
Contract type: Internship agreement
Job description:
Attached to the Indicators unit and reporting to your internship supervisor, your main mission will be to help improve performance monitoring, a priority area for the company, by building the indicators that measure our Freight operations. Your mission will be end to end, from functional design through to technical IT implementation, and will be organized as follows:

1/ Assist the functional project owner (~20%)
  • Define, specify and formalize the key performance indicators (KPIs) with the business teams
  • Learn how the business processes and procedures work in order to design and model computed indicators
  • Present the deliverables to end customers and deliver the necessary training

2/ Data science / data mining (~30%)
  • Identify, extract and aggregate data from the various sources (MySQL databases, data warehouse)
  • Propose new tools or methods to improve data processing and analysis

3/ Solution engineering and development (~50%)
  • Provide technical and functional maintenance of the existing tools
  • Develop these indicators using several technologies: Web (servers/front end) or reporting via Business Object / Datawarehouse / Spotfire

You are a final-year student at an engineering school or in a MIAGE program, and you are looking for a six-month end-of-studies internship. You have the following technical skills: data analysis, ETL, Business Object, data visualization. Proficiency in Spotfire and/or server technologies (SQL / PHP / Symfony framework) would be a plus. Functional skills: gathering user requirements. - You enjoy autonomy in managing your projects, working with end users, handling data and coding… - You have demonstrated in your previous experience a real ability to adapt, organize yourself and deliver. In line with the commitments made by Air France in its 2018-2020 agreement in favor of employing people with disabilities, our contracts are open to all.
Location: 45 rue de Paris 95747 ROISSY CHARLES DE GAULLE CEDEX
Minimum level of education required: Bac + 4 / second year of a grande école
Language / Level:
English: 3 - Proficient


Fundamentals of Predictive Analytics with JMP, Second Edition

Title: Fundamentals of Predictive Analytics with JMP 2nd Edition
Author: Ron Klimberg PhD, B. D. McCullough PhD
Publisher: SAS Institute
Year: 2016
Format: epub/pdf (converted)
Pages: 406
Size: 27.2 MB
Language: English

Written for students in undergraduate and graduate statistics courses, as well as for the practitioner who wants to make better decisions from data and models, this updated and expanded second edition of Fundamentals of Predictive Analytics with JMP® bridges the gap between courses on basic statistics, which focus on univariate and bivariate analysis, and courses on data mining and predictive analytics. Going beyond the theoretical foundation, this book gives you the technical knowledge and problem-solving skills that you need to perform real-world multivariate data analysis.
First, this book teaches you to recognize when it is appropriate to use a tool, what variables and data are required, and what the results might be. Second, it teaches you how to interpret the results and then, step-by-step, how and where to perform and evaluate the analysis in JMP®.

Python Data Science: The Bible. The Ultimate Beginner’s Guide to Learn Data Analysis, from the Basics and Essentials, to Advance Content! (Python Programming, Python Crash Course, Coding Made Easy Book)

Title: Python Data Science: The Bible. The Ultimate Beginner’s Guide to Learn Data Analysis, from the Basics and Essentials, to Advance Content! (Python Programming, Python Crash Course, Coding Made Easy Book)
Author: Mark Solomon Brown
Publisher: Amazon Digital Services LLC
Year: 2019
Pages: 116
Language: English
Format: epub, pdf (converted)
Size: 10.16 MB

Do you want to learn Python programming and enter the world of data science and research? Did you know that you can use the Python programming language to develop mobile applications and video games, as well as for scientific research and data mining? If your answer to these questions is yes, then this Python guide is a perfect match for you, and all you need to do is keep reading...

Data Mining for Selected Genes of Agronomic Interest from the Oil Palm Genome Using a Comparative Genomics Approach



Author: Rosli, R., Aug 2018

Supervisor: Murphy, D. (Supervisor), Hunt, F. (Supervisor) & Nieuwland, J. (Supervisor)

Student thesis: Doctoral Thesis

Original language: English
Awarding Institution
Award date: Aug 2018


BI Analyst / Senior Analyst


A leading company operating in the financial sector seeks an experienced professional to join its dynamic team of experts based in Athens. 


Duties and responsibilities

  • Take an active part in all data extraction, validation, analysis and remediation processes, streams and projects. Work in line with the BI Managers’ directions to transform flat data into meaningful, structured information sets and reports that support decision-making.
  • Design, develop and implement efficient and effective data mining processes in order to deliver accurate reports and information sets to internal customers, management and external bodies.
  • Translate, proof and document all designed, developed and executed processes as solid data flows, selection processes, connections and models.
  • Take on assigned streams and/or projects of data extraction and analysis in order to support large-scale projects.
  • Identify issues and areas for improvement within the entity’s data universe and information sources; propose solutions, remediation actions and efficiency improvements; and develop and implement the required tools, applications and fixes upon approval.
  • Collaborate with all business and functional lines, teams and stakeholders of the entity in order to provide solutions and high-quality services to internal customers regarding data extraction, analysis and reporting.


Commercial Analytics Contractor - Syneos Health Commercial Solutions - Yardley, PA

Support marketing and sales efforts quantitatively, through data mining to reveal both product and market trends and by answering key business questions.
From Syneos Health Commercial Solutions - Sat, 02 Nov 2019 02:32:13 GMT

A Research-Practice Partnership for Developing Computational Thinking through Linguistically and Culturally Relevant CS Curriculum in Middle School


This project will develop a research-practice partnership to plan and pilot a linguistically and culturally relevant computer science curriculum in middle school with the goal of broadening the participation of emergent bilingual (or English learner) students and Latino/a students in computer science education.

Funding Period: Tue, 10/01/2019 to Thu, 09/30/2021
Full Description:

The University of Texas at El Paso (UTEP), together with El Paso Independent School District (EPISD), will develop a research-practice partnership (RPP) to plan and pilot a linguistically and culturally relevant computer science curriculum in middle school with the goal of broadening the participation of emergent bilingual (or English learner) students and Latino/a students in computer science (CS) education. The project will focus on the development of an RPP that can effectively help teachers use bilingual and culturally relevant tools to develop the computational thinking (CT) skills of middle school students in EPISD. By bringing together the promise of culturally relevant CS education and of dual language instruction, this project will seek an innovative solution to the problem of underrepresentation of Latinas/os and emergent bilingual students/English learners in CS education and careers. It does so through a research-practice partnership that ensures responsiveness to the needs of educational practitioners and facilitates the integration of prior NSF-funded research with existing classroom curriculum and practice. The project, together with future scaling work, can potentially serve as a model in at least two existing large networks (the NSF-funded National CAHSI INCLUDES Alliance and the New Tech Network), strengthening efforts in both to broaden participation and engagement of underrepresented students, with particular focus on CS. Through dissemination across the 60 CAHSI institutions, the proposed linguistically and culturally relevant approach could potentially contribute to broadening Hispanic and emergent bilingual participation well beyond the El Paso region. The curriculum developed collaboratively by the RPP would also be disseminated through the national New Tech Network repository of PBL curriculum, accessible to other NTN schools across the country. The model of integrating culturally responsive CT/CS instruction and linguistically responsive dual language instruction has the potential to significantly advance efforts to reach, support, and engage more Hispanic youth in CS learning and careers.

The project builds upon research showing that culturally relevant CS education is a promising approach to broadening participation of minoritized students in CS and that dual language bilingual education is a successful approach to improving participation and academic achievement of emergent bilingual (or English learner) students. It combines the two by taking a culturally and linguistically relevant approach to CT/CS instruction for emergent bilingual and Latina/o students. Specifically, the project develops an RPP to plan, co-design, pilot, and refine a curriculum module that is bilingual (Spanish and English) and employs an existing NSF-funded culturally relevant game-based learning platform, Sol y Agua (Akbar, et al., 2018), that uses locally familiar El Paso area geography and ecology to teach computational thinking. The project will address the following research questions: (1) In what ways and to what extent do teachers demonstrate understanding of computational thinking principles and components and of dual language principles and instructional strategies? (2) How do teachers implement a linguistically and culturally relevant PBL module using the Sol y Agua game-based learning platform? and (3) In what ways and to what extent do students demonstrate learning of computational thinking principles and components during and after participating in a linguistically and culturally relevant PBL module using Sol y Agua? The project will deploy a range of data collection methods, including pre-post testing of teachers' knowledge and implementation of instruction, observation, video recordings of classrooms, student written assessments, and language tracking data from the software tool Sol y Agua. The research team will analyze the data using qualitative data analysis techniques as well as data mining and classification.



Data Scientist - USA-WA-Bellevue

Education: Bachelor's Degree. Skills: SQL, Tableau, Analysis Skills, R, Data Mining. ******* is looking for an experienced Data Scientist to help drive analyses, generate insights, and influence decision ...

Data Mining In An Allied Health Organisation A Real World Experience


Web Developer

OH-CINCINNATI, Graphet is seeking an experienced Web Developer with PHP development experience. The successful candidate will plan, design, create, manage, and test business applications using PHP, MySQL, and other Java servlet and stand-alone applications. Established in 2001, our company is growing due to increasing demand for our energy data mining services. We are looking for someone who is a self-starter an

Data Science and Star Science

I recently got a review copy of Statistics, Data Mining, and Machine Learning in Astronomy. I’m sure the book is especially useful to astronomers, but those of us who are not astronomers could use it as a survey of data analysis techniques, especially using Python tools, where all the examples happen to come from astronomy. […]


Hyperspectral and multispectral imaging: setting the scene



Amigo, J. M., 2020, Hyperspectral Imaging. Amigo, J. M. (ed.). Elsevier, p. 3-16 14 p. (Data Handling in Science and Technology, Vol. 32).

Research output: Chapter in Book/Report/Conference proceeding; Book chapter; Research; peer-review

Hyperspectral imaging (HSI) and multispectral imaging (MSI) are becoming increasingly popular in science. Nowadays, it is easy to find scientific laboratories with cameras that can capture an image of a sample at different wavelengths. Moreover, the latest advances in sensing technology and data mining are an important breakthrough for imaging technologies. The first chapter of this book is an introduction to the main concepts of HSI and MSI. A comprehensive overview of the main methods used to analyze HSI and MSI data is given, defining the major steps usually taken in the analysis. The last part of this introduction highlights some of the main applications of this technology covered in this book.

Original language: English
Title of host publication: Hyperspectral Imaging
Editors: José Manuel Amigo
Number of pages: 14
Publication date: 2020
ISBN (Print): 978-0-444-63977-6
Publication status: Published - 2020
Series: Data Handling in Science and Technology


QA software test Engineer

QA software test Engineer - Engineering

At StepStone we take great strides forward to remain the leading online recruitment marketplace. The answer lies in a combination of two important factors: market-leading products and truly exceptional, driven and highly talented people. Do you want to be part of a company you can be proud of and help it grow? Then keep reading. As QA / software test Engineer you will be part of an expanding international R&D team working in a fast-paced and innovative environment. The team you will join oversees building the search engines behind every site of the StepStone Group. This involves Machine Learning, Data Mining, Information Retrieval, and NLP technologies that are deployed on more than 25 platforms. You’ll work inside a Scrum development team directly among the developers and you will take ownership of all QA activities within your team. Help the team analyze user stories, use cases and requirements for validity and feasibility. Execute different levels of testing (System, Integration, a...

    Company: StepStone
    Job type: Full-time




How Google Is Stealing Your Personal Health Data


Expert Review by Maryam Heinen

Google is by far one of the greatest monopolies that has ever existed, and it poses a unique threat to anyone concerned about health, supplements, food and the ability to obtain truthful information about these and other issues.

This year, we’ve seen an unprecedented push to implement censorship across all online platforms, making obtaining and sharing crucial information about holistic health increasingly difficult.

As detailed in “Stark Evidence Showing How Google Censors Health News,” Google’s June 2019 update, which took effect June 3, effectively removed and hundreds of other natural health sites from Google search results. Google is also building a specific search tool for medical and health-related searches.1

And, while not the sole threat to privacy, Google is definitely one of the greatest. Over time, Google has positioned itself in such a way that it’s become deeply embedded in your day-to-day life, including your health.

In recent years, the internet and medicine have become increasingly intertwined, giving rise to “virtual medicine” and self-diagnosing — a trend that largely favors drugs and costly, invasive treatments — and Google has its proverbial fingers in multiple slices of this pie.

Health Data Mining Poses Unique Privacy Risks

For example, in 2016, Google partnered with WebMD, launching an app allowing users to ask medical questions.2 The following year, Google partnered with the National Alliance on Mental Illness, launching a depression self-assessment quiz which turned out to be little more than stealth marketing for antidepressants.3,4

Google and various tech startups have also been investigating the possibility of assessing mental health problems using a combination of electronic medical records and tracking your internet and social media use.

In 2018, Google researchers announced they’d created an artificial intelligence-equipped retinal scanner that can appraise your risk for a heart attack.5

According to a recent Financial Times report,6 Google, Amazon and Microsoft collect data entered into health and diagnostic sites, which is then shared with hundreds of third parties — and this data is not anonymized, meaning it’s tied specifically to you, without your knowledge or consent.

What this means is DoubleClick, Google’s ad service, will know which prescriptions you’ve searched for on, thus providing you with personalized drug ads. Meanwhile, Facebook receives information about what you’ve searched for in WebMD’s symptom checker.

“There is a whole system that will seek to take advantage of you because you’re in a compromised state,” Tim Lebert, a computer scientist at Carnegie Mellon University, told Financial Times.7 “I find that morally repugnant.”

While some find these kinds of technological advancements enticing, others see a future lined with red warning flags. As noted by Wolfie Christl, a technologist and researcher interviewed by Financial Times:8

“These findings are quite remarkable, and very concerning. From my perspective, this kind of data are clearly sensitive, has special protections

The following graphic, created by Financial Times, illustrates the flow of data from, a site that focuses on pregnancy, children’s health and parenting, to third parties, and the types of advertising these third parties then generate.

[Graphic: user data sent to third parties]

Tech Companies Are Accessing Your Medical Records

As described in the featured Wall Street Journal video,9 a number of tech companies, including Amazon, Apple and the startup Xealth, are diving into people’s personal electronic medical records to expand their businesses.

Xealth has developed an application that is embedded in your electronic health records. Doctors who use the Xealth application — which aims to serve most health care sectors and is being rapidly adopted as a preferred “digital formulary”10 — give the company vast access to market products to their patients. The app includes lists of products and services a doctor believes might be beneficial for certain categories of patients.

When seeing a patient, the doctor will select the products and services he or she wants the patient to get, generating an electronic shopping list that is then sent to the patient. The shopping links direct the patient to purchase these items from Xealth’s third-party shopping sites, such as Amazon.

As noted in the video, “Some privacy experts worry that certain Xealth vendors can see when a patient purchased a product through Xealth, and therefore through their electronic health record.” In the video, Jennifer Miller, assistant professor at Yale School of Medicine says:

“In theory, it could boost adherence to physician recommendations, which is a huge challenge in the U.S. health care system. On the other side, there are real worries about what type of information Amazon in particular is getting access to.

So, from what I understand, when a patient clicks on that Xealth app and is taken to Amazon, the data are coded as Xealth data, which means Amazon likely knows that you purchased these products through your electronic health records.”

Amazon Is Mining Health Records

Amazon, in turn, has developed software, called Amazon Comprehend Medical, which uses artificial intelligence (AI) to mine people’s electronic health records. This software has been sold to hospitals, pharmacies, researchers and various other health care providers.

The software reveals medical and health trends that might otherwise go unnoticed. As one example, given in the video, a researcher can use this software to mine tens of thousands of health records to identify candidates for a specific research study.

While this can certainly be helpful, it can also be quite risky, due to potential inaccuracies. Doctors may enter inaccurate data for a patient, for example, data that, were it accurate, would render that patient a poor test subject.

Apple is also getting in on the action through its health app. It facilitates access to electronic medical records by importing all your records directly from your health care provider. The app is meant to be “helpful” by allowing you to pull up your medical records on your iPhone and present them to any doctor, anywhere in the world.

What Does This Mean for Your Privacy?

While tech companies like Amazon and Apple claim your data are encrypted (to protect them from hacking) and that they cannot view your records directly, data breaches have become so common that such “guarantees” are next to worthless.

As noted in the video by Dudley Adams, a data use expert at the University of California, San Francisco, “No encryption is perfect. All it takes is time for that encryption to be broken.” One very real concern about having your medical records hacked is that your information may be sold to insurance companies and your employer, who can then use it against you, either by raising your rates or by denying you employment.

After all, sick people cost insurance companies and employers more money, so both have a vested interest in avoiding chronically ill individuals. So, were your medical records to get out, you could potentially become uninsurable or unemployable.

Google Collects Health Data on Millions of Americans

Getting back to Google, a whistleblower recently revealed the company amassed health data from millions of Americans in 21 states through its Project Nightingale,11,12 and patients have not been informed of this data mining. As reported by The Guardian:13

“A whistleblower who works in Project Nightingale … has expressed anger to the Guardian that patients are being kept in the dark about the massive deal.

The anonymous whistleblower has posted a video on the social media platform Daily Motion that contains a document dump of hundreds of images of confidential files relating to Project Nightingale.

The secret scheme … involves the transfer to Google of healthcare data held by Ascension, the second-largest healthcare provider in the U.S. The data is being transferred with full personal details including name and medical history and can be accessed by Google staff. Unlike other similar efforts it has not been made anonymous through a process of removing personal information known as de-identification …

Among the documents are the notes of a private meeting held by Ascension operatives involved in Project Nightingale. In it, they raise serious concerns about the way patients’ personal health information will be used by Google to build new artificial intelligence and other tools.”

The anonymous whistleblower told The Guardian:

“Most Americans would feel uncomfortable if they knew their data was being haphazardly transferred to Google without proper safeguards and security in place. This is a totally new way of doing things. Do you want your most personal information transferred to Google? I think a lot of people would say no.”

On a side note, the video the whistleblower uploaded to Daily Motion has since been taken down, with a note saying the “video has been removed due to a breach of the Terms of Use.”

According to Google and Ascension, the data being shared will be used to build a search tool with machine-learning algorithms that will spit out diagnostic recommendations and suggestions for medications that health professionals can then use to guide them in their treatment.

Google claims only a limited number of individuals will have access to the data, but just how trustworthy is Google these days? Something tells me that since the data includes full personal details, they’ll have no problem figuring out a way to eventually make full use of it.

Google Acquires Fitbit

In November 2019, the company also acquired Fitbit for $2.1 billion, giving Google access to the health data of Fitbit’s 25.4 million active users14 as well. While Google says it won’t sell or use Fitbit data for Google ads, some users have already ditched their devices for fear of privacy breaches.15 As reported by The Atlantic on November 14, 2019:16

“Immediately, users voiced concern about Google combining fitness data with the sizeable cache of information it keeps on its users. Google assured detractors that it would follow all relevant privacy laws, but the regulatory-compliance discussion only distracted from the strange future coming into view.

As Google pushes further into health care, it is amassing a trove of data about our shopping habits, the prescriptions we use, and where we live, and few regulations are governing how it uses these data.”

How HIPAA Laws Actually Allow This Data Mining

The HIPAA Privacy Rule is supposed to protect your medical records, preventing access by third parties, including spouses, unless you specifically give your permission for records to be shared. So, just how is it that Google and other tech companies can mine them at will?

As it turns out, the Google-Ascension partnership that gives Google access to medical data is covered by a “business associate agreement” or BAA. HIPAA allows hospitals and medical providers to share your information with third parties that support clinical activities, and according to Google’s interpretation of the privacy laws and HIPAA regulations, the company is not in breach of these laws because it’s a “business associate.” 

The Department of Health and Human Services’ Office for Civil Rights has opened an investigation into the legality of this arrangement.17 As reported by The Atlantic:18

“If HHS determines that Google and its handling of private information make it something more akin to a health care provider itself (because of its access to sensitive information from multiple sources who aren’t prompted for consent), it may find Google and Ascension in violation of the law and refer the matter to the Department of Justice for potential criminal prosecution.

But whether or not the deal goes through, its very existence points to a larger limitation of health-privacy laws, which were drafted long before tech giants started pouring billions into revolutionizing health care.”

Patients Bear the Risk While Third Parties Benefit

BAAs only allow for the disclosure of protected health information to entities that help the medical institution perform its health care functions. The third party is not permitted to use the data for its own purposes or in any independent way.

I personally find it hard to believe that Google would not find a way to profit from this personal health data, considering its web-like business structure that ties into countless other for-profit parties. Even if it doesn’t, there do not appear to be any distinct advantages to patients whose records are being shared. As reported by STAT News:19

“Jennifer Miller, a Yale medical school professor who studies patient privacy issues, said the way health information is being shared, whether legal or not, is far from ideal. Patients — whose data are shared without their knowledge or specific consent — end up with all the risks, she said, while the benefits, financial or otherwise, go to Google, Ascension, and potentially future patients.”

As reported by Health IT Security20 in March 2019, Sen. Catherine Cortez Masto, a Democrat from Nevada, has also introduced a data privacy bill “that would require companies not covered by HIPAA to obtain explicit consent from patients before sharing health and genetic data.”

“The bill covers the collecting and storing of sensitive data, such as biometrics, genetics, or location data,” Health IT Security writes.21 “The consent form must outline how that data will be used.

And the bill will also let consumers request, dispute the accuracy of their records, and transfer or delete their data “without retribution” around price or services offered.

Further, organizations would need to apply three standards to all data collection, processing, storage, and disclosure. First, collection must be for a legitimate business or operation purpose, without subjecting individuals to unreasonable risks to their privacy.

Further, the data may not be used to discriminate against individuals for protected characteristics, such as religious beliefs. Lastly, companies may not engage in deceptive data practices.”

Google Partnership Spurs Class-Action Lawsuit

The fact that patients don’t want Google to access their medical records is evidenced by a class-action lawsuit filed in the summer of 2019 against the University of Chicago Medical Center which, like Ascension, allowed Google access to identifiable patient data through a partnership with the University of Chicago. As reported by WTTW News June 28, 2019:22

“All three institutions are named as defendants in the suit, which was filed … by Matt Dinerstein, who received treatment at the medical center during two hospital stays in 2015.

The collaboration between Google and the University of Chicago was launched in 2017 to study electronic health records and develop new machine-learning techniques to create predictive models that could prevent unplanned hospital readmissions, avoid costly complications and save lives …

The tech giant has similar partnerships with Stanford University and the University of California-San Francisco. But that partnership violated federal law protecting patient privacy, according to the lawsuit, by allowing Google to access electronic health records of ‘nearly every patient’ at the medical center from 2009 to 2016.

The suit also claims Google will use the patient data to develop commercial health care technologies … The lawsuit claims the university breached its contracts with patients by ‘failing to keep their medical information private and confidential.’ It also alleges UChicago violated an Illinois law that prohibits companies from engaging in deceptive practices with clients.”

Like Ascension, the University of Chicago claims no confidentiality breaches have been made, since Google is a business associate. However, the lawsuit claims HIPAA was still violated because medical records were shared that “included sufficient information for Google to re-identify patients.”

The lawsuit also points out that Google does indeed have a commercial interest in all of this information, and can use it by combining it with its AI and advanced machine learning.

According to the plaintiffs, Google’s acquisition of DeepMind “has allowed for Google to find connections between electronic health records and Google users’ data.” The news report also points out that:23

“In 2015, Google and DeepMind obtained patient information from the Royal Free NHS Trust Foundation to conduct a study, which a data protection watchdog organization said ‘failed to comply with data protection law.’”

Health-Tracking Shoes and Other Privacy Abominations

Google is also investing in other wearable technologies aimed at tracking users’ health data, including:24

  • Shoes designed to monitor your weight, movement and falls
  • “Smart” contact lenses for people with age-related farsightedness and those who have undergone cataract surgery25 (a glucose-sensing contact lens for diabetics was canceled in 2018 after four years of development26)
  • A smartwatch to provide information for clinical research27
  • An all-in-one insulin patch pump for Type 2 diabetics that is prefilled and connected to the internet28

Google also has big plans for expanding the use of AI in health care. According to CB Insights,29 “The company is applying AI to disease detection, new data infrastructure, and potentially insurance.”

As mentioned earlier, insurance companies can jack up premiums based on your health. So, what could possibly go wrong by having Google’s AI wired into the insurance market?

Google has also partnered with drugmaker Sanofi, which “will leverage Google’s cloud and AI technologies and integrate them into its biological innovations and scientific data which in turn will accelerate the medicine discovery process,” according to a Yahoo! Finance report.30

According to Yahoo! Finance, “the collaboration will aid in the identification of various type of treatments suitable for patients. Additionally, Google’s AI tools are likely to be utilized by Sanofi in improving marketing and supply efforts and in forecasting sales.”

In plain English, this partnership will help Sanofi sell more drugs, which can hardly be said to be in the patients’ best interest, but rather in that of Sanofi and Google. As mentioned earlier, Verily, Google’s health care division, is also collaborating with Sanofi, Novartis, Otsuka and Pfizer to help them identify suitable patients for clinical drug trials.31

To boost drug sales even further, Verily is working with Walgreens to deploy a “medication adherence” project, in which patients are equipped with devices to ensure they’re taking their medication as prescribed.32

Amazon also plays a part in the drug adherence scheme with its recent buyout of Pillpack, an online pharmacy that offers prepackaged pill boxes with all the different medications you’re taking.

According to Yahoo! Finance, Amazon is also planning to develop at-home medical testing devices, and is rolling out the option to make medical-related purchases from Amazon using your health savings account. All of these things generate health-related data points that can then be used for other purposes, be it personalized marketing or insurance premium decisions.

Have You Had Enough of Google’s Privacy Intrusions Yet? 

Add to all of this data mining the fact that Google is actively manipulating search results and making decisions about what you’re allowed to see and what you’re not based on its own and third party interests — a topic detailed in a November 15, 2019 Wall Street Journal investigation.33 The dangers ahead should be self-evident.

Now more than ever we must work together to share health information with others by word-of-mouth, by text and email. We have built in simple sharing tools at the top of each article so you can easily email or text interesting articles to your friends and family.

My information is here because all of you support and share it, and we can do this without Big Tech’s support. It’s time to boycott and share! Here are a few other suggestions:

Become a subscriber to my newsletter and encourage your friends and family to do the same. This is the easiest and safest way to make sure you’ll stay up to date on important health and environmental issues.

If you have any friends or relatives that are seriously interested in their health, please share important articles with them and encourage them to subscribe to our newsletter.

Consider dumping any Android phone the next time you get a phone. Android is a Google operating system, and Google will seek to gather as much data as it can about you for its own benefit. iPhone, while not perfect, appears to have better privacy protections.

Use the internal search engine when searching for articles on my site.

Boycott Google by avoiding any and all Google products:

  • Stop using Google search engines, and recognize that even engines that honor privacy, like Start Page, still use Google as their back end and provide censored results. Alternatives include DuckDuckGo34 and Qwant35
  • Uninstall Google Chrome and use Brave or Opera browser instead, available for all computers and mobile devices.36 From a security perspective, Opera is far superior to Chrome and offers a free VPN service (virtual private network) to further preserve your privacy
  • If you have a Gmail account, try a non-Google email service such as ProtonMail,37 an encrypted email service based in Switzerland
  • Stop using Google docs. Digital Trends has published an article suggesting a number of alternatives38
  • If you’re a high school student, do not convert the Google accounts you created as a student into personal accounts

Sign the “Don’t be evil” petition created by Citizens Against Monopoly


Session on "Data driven materials science" at the DPG Spring Meeting (Dresden, Germany)


Dear colleagues, 

we would like to make you aware of the topical session 

"Data driven materials science"

which is part of the MM program during the DPG Spring Meeting 2020. The latter takes place March 15-20, 2020, in Dresden.  

If you are performing experiments or simulations in this emerging field, you are most welcome to contribute your abstract. You can find the session at the bottom of the list "Themenbereiche" on the abstract submission webpage.

In addition we will have some outstanding invited talks given by (some of them not yet confirmed):
- Tilmann Beck (TU Kaiserslautern)
- Cecilie Hebert (EPFL Lausanne, Switzerland)
- Jan Janssen (MPIE Düsseldorf)
- Marcus J. Neuer (BFI Düsseldorf)
- Stefan Sandfeld (TU Freiberg)

Please note that this session is not identical with the symposium "Big data driven materials science (SYBD)", which is on invitation only. Our session in MM is more focussed on structure-composition-property relationships in materials science. Please see the abstract below for details. 

We are looking forward to a vivid session on innovative developments!

Abstract:
----------
This session covers innovative high-throughput and materials-informatics approaches for the discovery, description and design of materials. The contributions should address recent developments in the fields of data mining, machine learning, and artificial intelligence for the identification of structure-composition-property relationships in the highly diverse, but often sparse materials data space. Contributions from experiment such as diffraction and various tomography techniques, materio-graphic feature identification, as well as simulation results from the atomistic up to the continuum level are foreseen. A particular focus will be on the consideration of extended materials defects (grain boundaries, stacking faults, dislocation cores) and microstructures. Furthermore, contributions on accumulating, analyzing, interpreting, storing, and sharing fundamental knowledge about materials are solicited. Contributions may range, and preferably bridge, from physics-based materials understanding to data-driven and application-oriented development.


Occupancy Planner, Background in Space Planning Preferred | CBRE

Memphis,, JOB SUMMARY The purpose of this position is to provide space planning, data mining/analysis, reporting and interpretation of space planning metrics in an effort to provide recommendations and pr

Other: Data Scientist - Applied Research - Bedford, Massachusetts

Why choose between doing meaningful work and having a fulfilling life? At MITRE, you can have both. That's because MITRE people are committed to tackling our nation's toughest challenges, and we're committed to the long-term well-being of our employees. MITRE is different from most technology companies. We are a not-for-profit corporation chartered to work for the public interest, with no commercial conflicts to influence what we do. The Research & Development centers we operate for the government create lasting impact in fields as diverse as cybersecurity, healthcare, aviation, defense, and enterprise transformation. We're making a difference every day, working for a safer, healthier, and more secure nation and world.

Join the Data Analytics team, where you will bring your machine learning, advanced analytics, and applied research skills to bear on solving problems of critical national importance. MITRE's diverse work program provides opportunities to apply your expertise and creative thinking in challenging domains such as healthcare, transportation, and national security. Employees may also participate in MITRE's internal research and development program, which provides funding for innovative applied research that addresses our sponsors' hardest problems.

The successful candidate for this position is an experienced data scientist who combines a solid theoretical and technical background with the ability to formulate problems, develop and evaluate solutions, and communicate results. Experience with advanced analytic techniques and methods (e.g., supervised and unsupervised machine learning, deep learning, data visualization) as well as hands-on software development skills are a must. Our organization values innovation and lifelong learning and believes that in today's fast-paced technological environment keeping up with the latest research and technologies is an essential part of working in this field.

Responsibilities include:

  • Apply a variety of analytical techniques to tackle customer challenges to include data mining, statistical models, predictive analytics, optimization, risk analysis, and data visualization
  • Perform original research, development, test and evaluation, and demonstration of advanced analytic capabilities
  • Build and test prototypes in MITRE, government labs, and commercial cloud environments
  • Perform independent reviews of contractor proposed architectures, designs and products
  • Apply state of the art techniques, using multiple programming languages and development environments and open source code to drive advances in mission capabilities

Basic Qualifications:

  • Bachelor's Degree in Data Science, Computer Science or related field
  • At least 5 years of professional experience

Required Qualifications:

  • Must be a U.S. citizen with ability to possess and maintain DoD clearance
  • Proficiency in use of Microsoft Office including Outlook, Excel, and Word
  • Must have demonstrated proficiency and strength in verbal, written, PC, presentation, and communications skills
  • Experience conducting original research using data science techniques, including machine learning, deep learning, statistical modeling, and data visualization
  • Passion for working with data and solving real-world business problems or creatively advancing business operations
  • Hands-on software development skills (Python, R, C++, C#, Java, Javascript)
  • Strong technical writing and presentation skills
  • Proven ability to work effectively in a collaborative teaming environment

Preferred Qualifications:

  • Advanced degree in related field of study
  • Candidates that possess a current/active US Government Secret clearance are preferred
  • Demonstrated technical leadership
  • Knowledge of Cloud Computing technologies, particularly in the AWS environment
  • Understanding of Big Data tools (e.g. NoSQL, Spark, Hadoop, ElasticSearch)

MITRE's workplace reflects our values. We offer competitive benefits, exceptional professional development opportunities, and a culture of innovation that embraces diversity, inclusion, flexibility, collaboration, and career growth. If this sounds like the choice you want to make, then choose MITRE, and make a difference with us. U.S. citizenship is required for most positions.


© Googlier LLC, 2019