Comment on How to Use Word Embedding Layers for Deep Learning with Keras by Jason Brownlee

You're welcome. I require all of the code to work and keep working!

[Link] Book Review: ‘The AI Does Not Hate You’ by Tom Chivers (Scott Aaronson)

Published on October 7, 2019 6:16 PM UTC

Scott Aaronson has posted a review of The AI Does Not Hate You, a book by Tom Chivers.

The book:

This is a book about AI and AI risk. But it's also more importantly about a community of people who are trying to think rationally about intelligence, and the places that these thoughts are taking them, and what insight they can and can't give us about the future of the human race over the next few years.

The book has a dual purpose: it gives an account of the most important events in the rationalist community while reporting on the current state of the AI risk field. There's a LessWrong discussion here.

This post is about Scott Aaronson's review. In it he gives his opinions about the book, including his relationship with the rationalist community and his somewhat changing views on AI risk.

Reading Chivers’s book prompted me to reflect on my own relationship to the rationalist community.
The astounding progress in deep learning and reinforcement learning and GANs, which caused me (like everyone else, perhaps) to update in the direction of human-level AI in our lifetimes being an actual live possibility.


Introducing Amazon SageMaker ml.p3dn.24xlarge instances, optimized for distributed machine learning with up to 4x the network bandwidth of ml.p3.16xlarge instances


Amazon SageMaker now supports ml.p3dn.24xlarge, the most powerful P3 instance optimized for machine learning applications. This instance provides faster networking, which helps remove data transfer bottlenecks and optimizes the utilization of GPUs to deliver maximum performance for training deep learning models.


GigaIO Optimizes Scalability of Xilinx Alveo FPGAs


Today GigaIO introduced FabreX support for Xilinx Alveo Accelerators, in addition to an exclusive offering that provides Xilinx FPGA developers with remote cloud access to the FabreX platform. In conjunction with the Xilinx Alveo family of adaptable accelerator cards, Xilinx developers will use FabreX to enhance proof of concept, software testing, and scale-out deployments in applications like artificial intelligence, deep learning inference, and high-performance computing.

The post GigaIO Optimizes Scalability of Xilinx Alveo FPGAs appeared first on insideHPC.


Comment on Cell Nuclei Detection on Whole-Slide Histopathology Images Using HistomicsTK and Faster R-CNN Deep Learning Models by jay

Hi, I was trying to implement your pipeline and downloaded the datasets you mentioned in the post, but I did not know how to convert the public data into TFRecords. There is a script in lumi that is used to process images with XML annotations into TFRecord data. However, I found that this script was written for rectangular bounding boxes, which is not suitable for polygon annotations. Could you tell me how you did it? Thanks a lot!

A deep learning genome-mining strategy for biosynthetic gene cluster prediction

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers reduced false positive rates in BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing machine-learning tools. We supplemented this with random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable putative BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represents a major addition to in-silico BGC identification.


MIOpen: An Open Source Library For Deep Learning Primitives

Deep Learning has established itself to be a common occurrence in the business lexicon. The unprecedented success of deep learning in recent years can be attributed to: abundance of data, availability of gargantuan compute capabilities offered by GPUs, and adoption of open-source philosophy by the researchers and industry. Deep neural networks can be decomposed into […]

Exascale Deep Learning for Scientific Inverse Problems

We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate […]

Elastic deep learning in multi-tenant GPU cluster

Multi-tenant GPU clusters are common nowadays due to the huge success of deep learning and training jobs are usually conducted with multiple distributed GPUs. These GPU clusters are managed with various goals including short JCT, high resource utilization and quick response to small jobs. In this paper, we show that elasticity, which is the ability […]

Deep Learning Engineer - Deeplite Inc. - Quebec City, QC

The platform is in development phase, where we are implementing academic works and our patent portfolio into our technology. What We Need To See.
From - Sun, 06 Oct 2019 10:42:28 GMT - View all Quebec City, QC jobs

Intel unveils new powerful W-2200 Xeon chip series

What you need to know: Intel today unveiled its new W-2200 Xeon chip series. The new chips offer 2x faster 3D architecture rendering and 97% faster 4K video editing. The W-2200 Xeon chips could be included in future refreshed models of the iMac Pro. Intel calls the new chip series the "ultimate creator platform." Today Intel took the wraps off its brand new W-2200 Xeon chip series. The new chip series is compatible with the iMac Pro, which could add the new chips as a refresh looms for Apple's most powerful desktop. The W-2200 Xeon chips feature 18 AVX-512 enabled cores along with Turbo Boost Max 3.0, 48 PCIe lanes and AI acceleration with Intel Deep Learning Boost compatible with motion graphics, 3D rendering and visual effects. Among the benefits Intel touts of the new W-2200 series are 2x faster 3D architecture rendering, 97% faster 4K video editing and 2.1x faster video game compile times. As for the iMac Pro, Apple currently uses Intel Xeon-W chips. However, thi...

GED Capital invests in BNext, Buguroo and Kushki through Conexo Ventures


The independent Iberian asset manager GED Capital is launching its first venture capital fund: Conexo Ventures. This new GED Capital activity in the venture capital segment adds to the firm's existing private equity and infrastructure investment strategies. Through it, they have invested in BNext (Spain's first marketplace for financial, insurance and travel-related products), Buguroo (bank fraud prevention using deep learning) and Kushki (an online payment platform for Latin America with artificial intelligence).

The fund, which has investment commitments of close to 20 million euros, seeks opportunities in Iberian startups that develop disruptive software with scalable and defensible business models, preferably using artificial intelligence, for high-demand sectors such as cybersecurity, payments and marketplaces, finance and insurance, legal and administration, and travel and lodging. It focuses on Series A investments in startups that have developed products and generate meaningful revenue after demonstrating market acceptance with metrics that support the investment thesis.

Conexo Ventures will seek to support well-formed teams of entrepreneurs that can be exported to the United States, where they can secure subsequent funding rounds (Series B and C) and, ultimately, exits. The Conexo Ventures management team has an active presence in Boston and Silicon Valley. This air bridge between Spain and the United States will serve as a fast track for the most innovative companies in Spain and Portugal that wish to grow in North America.

Startups backed by Conexo Ventures will have legal, human resources and sales support for their international expansion. They will also have the management team's support in syndicating future funding rounds with North American private equity funds.

The senior executive team of Conexo Ventures is made up of three partners with experience in entrepreneurship and venture investing: Joaquim Hierro, Isaac de la Peña and Damien Balsan. The Conexo Ventures partners will take an active part in the companies' strategy and will be involved in operations alongside the startups' founders.

In this regard, Joaquim Hierro Lopes, managing partner of GED Capital, said: "We had wanted to expand our activity into venture capital for some time, and when we met the Conexo team we had no doubt that they were the right people to work with us on this new private capital strategy that we will develop from our firm." For his part, Isaac de la Peña, partner at Conexo Ventures, added: "Our fund's support will be fundamental for the startups we invest in to access the United States market on the fast track. Taking advantage of the capital efficiencies of Southern Europe, our internationalization strategy will serve, among other things, to obtain higher investment multiples through exits in more advanced markets. Proof of our excellent judgment in selecting and supporting companies is that two of the deals we have done have already multiplied their value by four and five times in less than a year."


Are Intel and Cray taking us off the cliff?

"Many changes are coming in the decade of the 2020's. Despite the bitter and acrimonious fighting in Washington, life is going on in the tech world. Our Brave New World is right around the corner." TECH WARNING. This might get a bit tech deep, but these are the times we live in. This is something that Elon Musk, Glenn Beck and others have been warning us about. It has to do with things (terms) called exaflops, deep learning, chatbots and turing tests. Hello? Did I lose everybody as of yet? These terms are important as we are on the cusp of making the BIG LEAP into true AI. The super highway needed to get us to AI is the super computer. And Intel and Cray are about to unleash a doozie in 2021. Maybe some definitions would help before I continue: […]

Bird’s-AI View: How Deep Learning Helps Ornithologists Track Migration Patterns


Billions of birds in North America make the trek south each fall, migrating in pursuit of warmer winter temperatures. But at least a quarter of them don’t make it back to northern breeding grounds in the spring, falling victim to predators, weather or man-made hazards like oil pits and cell towers. Many of these migratory […]

The post Bird’s-AI View: How Deep Learning Helps Ornithologists Track Migration Patterns appeared first on The Official NVIDIA Blog.


University of North Dakota Deploys Bright Cluster Manager for HPC,...


UND designs new cluster to create a unified environment with Bright automation software that delivers versatility, access to the cloud and deep learning resources.

(PRWeb October 08, 2019)


Computer vision: Past, present and future


Thanks to recent advances in Artificial Intelligence (AI) and deep learning, image recognition has become a reality.

The post Computer vision: Past, present and future appeared first on SAS Blogs.


Camect Smart Camera Hub Coming Soon

For those of you who secure your property with cameras, or are considering doing so, there is a new product coming soon that may interest you, and later this month Chris will have a video review of it out as well. Camect is a smart camera hub that is a bit different from other security camera systems: it is not tied to proprietary camera systems, applies AI to detect objects, and keeps all footage local, though you can enable cloud backup if you wish. Captured video can still be accessed over the Internet even without the backup, but it is not uploaded to remote servers outside your control, and for two cameras, this worldwide access is free. If your security system uses additional cameras and you want to view more than two of them remotely, then there is a $60-a-year subscription fee.

The Camect works by first searching for video feeds from security cameras on your network and then aggregating those feeds onto its 1 TB of expandable storage. While the average home security system will have five cameras, it is able to handle about twelve 1080p cameras, of average scene complexity, so you should have some room to expand your system if you wish. The feeds from these cameras are also analyzed with a deep learning algorithm to identify the activity captured, and depending on what it is, you will be alerted. By providing feedback to the device on these notifications, it can improve its model and become more accurate, so it will let you know when a package is being delivered but not when a squirrel runs by. By providing the system public camera live streams found on YouTube, the company has created a demonstration of these alerts that you can find here: Camect Alerts Demo.

Presently there is an IndieGoGo page for Camect where you can pre-order a unit for $299, down from the usual $399 price, or $549 with lifetime service that would normally cost $1149. They are expected to ship in January 2020.

Source: Press Release


Comment on Ofsted criticises 3-year GCSEs and low EBacc entry in new inspection report by Roger Titcombe

OfSTED are right about this. Starting GCSEs in Y9 is common. It is a form of gaming performance tables in terms of GCSE Grade 4/5. It is massively damaging to students in terms of cognitive development, deep learning and ultimately post-16 progression, especially to Academic and STEM A Levels. However, unless OfSTED penalises schools much more rigorously there will be little change. See

A Construction Kit for Deep Learning

The ETH spin-off «Mirage Technologies» has developed a deep learning platform intended to help start-ups and companies develop and optimize their products faster.

Machine Learning: California Wants to Crack Down on Deepfake Pornography

Gal Gadot in a porn film and Obama insulting Trump: California wants to take stronger action against videos faked with the help of AI. Uploading such content is to be banned. The intention is good, civil rights advocates say, but it is also a restriction on freedom of expression. (Deep Learning, AI)

Intel Launches New W-2200 Xeon Chips Appropriate for an Updated iMac Pro

Intel today launched new W-2200 Cascade Lake-X Xeon chips that are potentially suitable for a new iMac Pro should Apple be planning to refresh the machine in the near future.

Right now, Apple uses custom Intel Xeon-W chips for its iMac Pro models, but could use a stock version of the W-2200 Xeon chips or a custom version.

There are up to 18 AVX 512 enabled cores in the new W-2200 chips, along with up to 48 PCIe lanes, Turbo Boost Max 3.0, and AI acceleration (Intel Deep Learning Boost) for visual effects, motion graphics, 3D rendering, and more. The chips are similar to Intel's X-Series chips but with Intel vPro for support for up to 1TB ECC RAM, VROC, and RAS (reliability, availability, and serviceability) features.

According to Intel, its new chips offer 2x faster 3D architecture rendering, 97% faster 4K video editing, and 2.1x faster video game compile times.

Intel is introducing a new, more affordable pricing structure for the updated chips, dropping prices by up to almost 50 percent compared to prior-generation Xeon chips. The pricing cuts could drive the cost of future iMac Pro models down should Apple pass those savings along to consumers.

Apple released the iMac Pro in 2017 and hasn't updated it since then, so it's due for a refresh. There are no rumors that an updated model is in the works, but we often don't hear much about minor Mac refreshes, so upgraded processors and other hardware could still come in a 2019 update.

Intel says the new Xeon W-2200 chips will be available starting in November.

Tag: Intel

This article, "Intel Launches New W-2200 Xeon Chips Appropriate for an Updated iMac Pro" first appeared on


A Conditional Generative Model for Predicting Material Microstructures from Processing Methods. (arXiv:1910.02133v1 [eess.IV])


Authors: Akshay Iyer, Biswadip Dey, Arindam Dasgupta, Wei Chen, Amit Chakraborty

Microstructures of a material form the bridge linking processing conditions, which can be controlled, to the material property, which is the primary interest in engineering applications. Thus a critical task in material design is establishing the processing-structure relationship, which requires domain expertise and techniques that can model the high-dimensional material microstructure. This work proposes a deep learning based approach that models the processing-structure relationship as a conditional image synthesis problem. In particular, we develop an auxiliary classifier Wasserstein GAN with gradient penalty (ACWGAN-GP) to synthesize microstructures under a given processing condition. This approach is free of feature engineering, requires modest domain knowledge and is applicable to a wide range of material systems. We demonstrate this approach using the ultra high carbon steel (UHCS) database, where each microstructure is annotated with a label describing the cooling method it was subjected to. Our results show that ACWGAN-GP can synthesize high-quality multiphase microstructures for a given cooling method.


Learning from Fact-checkers: Analysis and Generation of Fact-checking Language. (arXiv:1910.02202v1 [cs.CL])


Authors: Nguyen Vo, Kyumin Lee

In fighting against fake news, many fact-checking systems comprising human-based fact-checking sites and automatic detection systems have been developed in recent years. However, online users still keep sharing fake news even when it has been debunked. It means that early fake news detection may be insufficient and we need another complementary approach to mitigate the spread of misinformation. In this paper, we introduce a novel application of text generation for combating fake news. In particular, we (1) leverage online users named fact-checkers, who cite fact-checking sites as credible evidence to fact-check information in public discourse; (2) analyze linguistic characteristics of fact-checking tweets; and (3) propose and build a deep learning framework to generate responses with fact-checking intention to increase the fact-checkers' engagement in fact-checking activities. Our analysis reveals that the fact-checkers tend to refute misinformation and use formal language (e.g. few swear words and Internet slang). Our framework successfully generates relevant responses, and outperforms competing models by achieving up to 30% improvements. Our qualitative study also confirms the superiority of our generated responses compared with responses generated from the existing models.


A Case Study on Using Deep Learning for Network Intrusion Detection. (arXiv:1910.02203v1 [cs.CR])


Authors: Gabriel C. Fernandez, Shouhuai Xu

Deep Learning has been very successful in many application domains. However, its usefulness in the context of network intrusion detection has not been systematically investigated. In this paper, we report a case study on using deep learning for both supervised network intrusion detection and unsupervised network anomaly detection. We show that Deep Neural Networks (DNNs) can outperform other machine learning based intrusion detection systems, while being robust in the presence of dynamic IP addresses. We also show that Autoencoders can be effective for network anomaly detection.


City-level Geolocation of Tweets for Real-time Visual Analytics. (arXiv:1910.02213v1 [cs.SI])


Authors: Luke S. Snyder, Morteza Karimzadeh, Ray Chen, David S. Ebert

Real-time tweets can provide useful information on evolving events and situations. Geotagged tweets are especially useful, as they indicate the location of origin and provide geographic context. However, only a small portion of tweets are geotagged, limiting their use for situational awareness. In this paper, we adapt, improve, and evaluate a state-of-the-art deep learning model for city-level geolocation prediction, and integrate it with a visual analytics system tailored for real-time situational awareness. We provide computational evaluations to demonstrate the superiority and utility of our geolocation prediction model within an interactive system.


Colored Transparent Object Matting from a Single Image Using Deep Learning. (arXiv:1910.02222v1 [cs.CV])


Authors: Jamal Ahmed Rahim, Kwan-Yee Kenneth Wong

This paper proposes a deep learning based method for colored transparent object matting from a single image. Existing approaches for transparent object matting often require multiple images and long processing times, which greatly hinder their applications on real-world transparent objects. The recently proposed TOM-Net can produce a matte for a colorless transparent object from a single image in a single fast feed-forward pass. In this paper, we extend TOM-Net to handle colored transparent object by modeling the intrinsic color of a transparent object with a color filter. We formulate the problem of colored transparent object matting as simultaneously estimating an object mask, a color filter, and a refractive flow field from a single image, and present a deep learning framework for learning this task. We create a large-scale synthetic dataset for training our network. We also capture a real dataset for evaluation. Experiments on both synthetic and real datasets show promising results, which demonstrate the effectiveness of our method.


Self-supervised Feature Learning for 3D Medical Images by Playing a Rubik's Cube. (arXiv:1910.02241v1 [cs.CV])


Authors: Xinrui Zhuang, Yuexiang Li, Yifan Hu, Kai Ma, Yujiu Yang, Yefeng Zheng

With the development of deep learning, an increasing number of studies try to build computer-aided diagnosis systems for 3D volumetric medical data. However, as annotations of 3D medical data are difficult to acquire, the number of annotated 3D medical images is often not enough to train deep learning networks well. Self-supervised learning, which deeply exploits the information in raw data, is one potential solution for relaxing the requirement for training data. In this paper, we propose a self-supervised learning framework for volumetric medical images. A novel proxy task, i.e., Rubik's cube recovery, is formulated to pre-train 3D neural networks. The proxy task involves two operations, i.e., cube rearrangement and cube rotation, which force networks to learn translational and rotational invariant features from raw 3D data. Compared to the train-from-scratch strategy, fine-tuning from the pre-trained network leads to better accuracy on various tasks, e.g., brain hemorrhage classification and brain tumor segmentation. We show that our self-supervised learning approach can substantially boost the accuracies of 3D deep learning networks on volumetric medical datasets without using extra data. To the best of our knowledge, this is the first work focusing on the self-supervised learning of 3D neural networks.
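
The two proxy-task operations named in the abstract (cube rearrangement and cube rotation) can be illustrated with a small sketch. This is not the authors' implementation: the 2x2x2 split, the 180-degree rotation, the 50% rotation probability, and all function names are assumptions chosen for illustration, and plain nested lists stand in for real image volumes.

```python
import random

def split_into_cubes(volume, n=2):
    """Split a cubic nested-list volume into n**3 equally sized sub-cubes."""
    side = len(volume) // n
    cubes = []
    for cz in range(n):
        for cy in range(n):
            for cx in range(n):
                cubes.append([
                    [
                        [volume[cz * side + z][cy * side + y][cx * side + x]
                         for x in range(side)]
                        for y in range(side)
                    ]
                    for z in range(side)
                ])
    return cubes

def rotate_180(cube):
    """Rotate a sub-cube by 180 degrees about its first axis (flip the other two)."""
    return [[row[::-1] for row in plane[::-1]] for plane in cube]

def make_proxy_sample(volume, rng, n=2):
    """Return shuffled (and possibly rotated) sub-cubes plus the labels to recover."""
    cubes = split_into_cubes(volume, n)
    order = list(range(len(cubes)))
    rng.shuffle(order)                              # cube rearrangement (assumed uniform shuffle)
    rotated = [rng.random() < 0.5 for _ in order]   # cube rotation flags (assumed 50% chance)
    pieces = [rotate_180(cubes[i]) if flip else cubes[i]
              for i, flip in zip(order, rotated)]
    return pieces, order, rotated

# Toy 4x4x4 volume whose voxel values encode their own position.
volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)] for z in range(4)]
pieces, order, rotated = make_proxy_sample(volume, random.Random(0))
```

In a pre-training setup along these lines, a network would receive `pieces` and be trained to predict `order` and `rotated`, so the labels come from the raw data rather than from manual annotation.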


Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics. (arXiv:1910.02249v1 [cs.LG])


Authors: Bingzhe Wu, Chaochao Chen, Shiwan Zhao, Cen Chen, Yuan Yao, Guangyu Sun, Li Wang, Xiaolu Zhang, Jun Zhou

Bayesian deep learning has recently been regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks (DNNs). Stochastic Gradient Langevin Dynamics (SGLD) is an effective method to enable Bayesian deep learning on large-scale datasets. Previous theoretical studies have shown various appealing properties of SGLD, ranging from the convergence properties to the generalization bounds. In this paper, we study the properties of SGLD from a novel perspective of membership privacy protection (i.e., preventing the membership attack). The membership attack, which aims to determine whether a specific sample is used for training a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework to analyze the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent. Moreover, our theoretical analysis can be naturally extended to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that the SGLD algorithm can not only reduce the information leakage but also improve the generalization ability of the DNN models in real-world applications.


Parallelizing Training of Deep Generative Models on Massive Scientific Datasets. (arXiv:1910.02270v1 [cs.DC])


Authors: Sam Ade Jacobs, Brian Van Essen, David Hysom, Jae-Seung Yeom, Tim Moon, Rushil Anirudh, Jayaraman J. Thiagaranjan, Shusen Liu, Peer-Timo Bremer, Jim Gaffney, Tom Benson, Peter Robinson, Luc Peterson, Brian Spears

Training deep neural networks on large scientific data is a challenging task that requires enormous compute power, especially if no pre-trained models exist to initialize the process. We present a novel tournament method to train traditional as well as generative adversarial networks built on LBANN, a scalable deep learning framework optimized for HPC systems. LBANN combines multiple levels of parallelism and exploits some of the world's largest supercomputers. We demonstrate our framework by creating a complex predictive model based on multi-variate data from high-energy-density physics containing hundreds of millions of images and hundreds of millions of scalar values derived from tens of millions of simulations of inertial confinement fusion. Our approach combines an HPC workflow and extends LBANN with optimized data ingestion and the new tournament-style training algorithm to produce a scalable neural network architecture using a CORAL-class supercomputer. Experimental results show that 64 trainers (1024 GPUs) achieve a speedup of 70.2 over a single trainer (16 GPUs) baseline, and an effective 109% parallel efficiency.
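
The efficiency figure quoted above follows from the speedup and the resource ratio. As a minimal sketch, assuming the usual definition of parallel efficiency as speedup divided by the factor by which resources grew (the function name is hypothetical, not from the paper):

```python
def parallel_efficiency(speedup, resource_ratio):
    """Speedup achieved per unit of added resources; above 1.0 is super-linear."""
    return speedup / resource_ratio

# 64 trainers (1024 GPUs) versus a single trainer (16 GPUs): 64x the resources.
resource_ratio = 1024 / 16
efficiency = parallel_efficiency(70.2, resource_ratio)
print(f"{efficiency:.1%}")  # prints "109.7%"
```

Under that definition the result is about 1.097, which matches the reported 109% if the figure is truncated rather than rounded.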


A Deep Learning System That Generates Quantitative CT Reports for Diagnosing Pulmonary Tuberculosis. (arXiv:1910.02285v1 [eess.IV])


Authors: Wei Wu, Xukun Li, Peng Du, Guanjing Lang, Min Xu, Kaijin Xu, Lanjuan Li

We developed a deep learning model-based system to automatically generate a quantitative Computed Tomography (CT) diagnostic report for Pulmonary Tuberculosis (PTB) cases. 501 CT imaging datasets from 223 patients with active PTB were collected, and another 501 cases from a healthy population served as negative samples. 2884 lesions of PTB were carefully labeled and classified manually by professional radiologists. Three state-of-the-art 3D convolution neural network (CNN) models were trained and evaluated in the inspection of PTB CT images. Transfer learning method was also utilized during this process. The best model was selected to annotate the spatial location of lesions and classify them into miliary, infiltrative, caseous, tuberculoma and cavitary types simultaneously. Then the Noisy-Or Bayesian function was used to generate an overall infection probability. Finally, a quantitative diagnostic report was exported. The results showed that the recall and precision rates, from the perspective of a single lesion region of PTB, were 85.9% and 89.2% respectively. The overall recall and precision rates, from the perspective of one PTB case, were 98.7% and 93.7%, respectively. Moreover, the precision rate of the PTB lesion type classification was 90.9%. The new method might serve as an effective reference for decision making by clinical doctors.
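
The abstract does not spell out how the Noisy-Or function combines lesion-level outputs. A minimal sketch, assuming the standard Noisy-Or form in which a case counts as infected unless every detected lesion is independently a false alarm, could look like this (the function name and the example probabilities are made up for illustration):

```python
from math import prod

def noisy_or(lesion_probs):
    """Combine per-lesion probabilities into one case-level probability.

    Standard Noisy-Or: P(case) = 1 - product over lesions of (1 - p_i).
    """
    return 1.0 - prod(1.0 - p for p in lesion_probs)

# Hypothetical per-lesion probabilities for one CT case.
case_probability = noisy_or([0.2, 0.5, 0.9])
```

With these example inputs the product of the complements is 0.8 × 0.5 × 0.1 = 0.04, so the case-level probability is about 0.96. A case with no detected lesions gets probability 0.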


Bayesian Learning-Based Adaptive Control for Safety Critical Systems. (arXiv:1910.02325v1 [eess.SY])


Authors: David D. Fan, Jennifer Nguyen, Rohan Thakker, Nikhilesh Alatur, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou

Deep learning has enjoyed much recent success, and applying state-of-the-art model learning methods to controls is an exciting prospect. However, there is a strong reluctance to use these methods on safety-critical systems, which have constraints on safety, stability, and real-time performance. We propose a framework which satisfies these constraints while allowing the use of deep neural networks for learning model uncertainties. Central to our method is the use of Bayesian model learning, which provides an avenue for maintaining appropriate degrees of caution in the face of the unknown. In the proposed approach, we develop an adaptive control framework leveraging the theory of stochastic CLFs (Control Lyapunov Functions) and stochastic CBFs (Control Barrier Functions) along with tractable Bayesian model learning via Gaussian Processes or Bayesian neural networks. Under reasonable assumptions, we guarantee stability and safety while adapting to unknown dynamics with probability 1. We demonstrate this architecture for high-speed terrestrial mobility targeting potential applications in safety-critical high-speed Mars rover missions.


Large-scale Mobile App Identification Using Deep Learning. (arXiv:1910.02350v1 [cs.NI])


Authors: Shahbaz Rezaei, Bryce Kroencke, Xin Liu

Many network services and tools (e.g. network monitors, malware-detection systems, routing and billing policy enforcement modules in ISPs) depend on identifying the type of traffic that passes through the network. With the widespread use of mobile devices, the vast diversity of mobile apps, and the massive adoption of encryption protocols (such as TLS), large-scale traffic classification becomes inevitable and more difficult. In this paper, we propose a deep learning model for mobile app identification. The proposed model only needs the payload of the first few packets for classification, and, hence, it is suitable even for applications that rely on early prediction, such as routing and QoS provisioning. The deep model achieves between 84% and 98% accuracy for the identification of 80 popular apps. We also perform occlusion analysis for the first time to bring insight into what data is leaked from the SSL/TLS protocol that allows accurate app identification. Moreover, our traffic analysis shows that many apps generate not only app-specific traffic, but also numerous ambiguous flows. Ambiguous flows are flows generated by common functionality modules, such as advertisement and traffic analytics. Because such flows are common among many different apps, identifying the source app that generates ambiguous flows is challenging. To address this challenge, we propose a CNN+LSTM model that takes adjacent flows to learn the order and pattern of multiple flows, to better identify the app that generates them. We show that such flow association considerably improves the accuracy, particularly for ambiguous flows. Furthermore, we show that our approach is robust to mixed traffic scenarios where some unrelated flows may appear in adjacent flows. To the best of our knowledge, this is the first work that identifies the source app for ambiguous flows.


Non-Uniform Conductivity Estimation for Personalized Brain Stimulation using Deep Learning. (arXiv:1910.02420v1 [cs.LG])


Authors: Essam A. Rashed, Jose Gomez-Tames, Akimasa Hirata

Electromagnetic stimulation of the human brain is a key tool for the neurophysiological characterization and diagnosis of several neurological disorders. Transcranial magnetic stimulation (TMS) is one procedure that is commonly used clinically. However, personalized TMS requires a pipeline for accurate head model generation to provide target-specific stimulation. This process includes intensive segmentation of several head tissues based on magnetic resonance imaging (MRI), which has significant potential for segmentation error, especially for low-contrast tissues. Additionally, a uniform electrical conductivity is assigned to each tissue in the model, which is an unrealistic assumption based on conventional volume conductor modeling. This paper proposes a novel approach to the automatic estimation of electric conductivity in the human head for volume conductor models without anatomical segmentation. A convolutional neural network is designed to estimate personalized electrical conductivity values based on anatomical information obtained from T1- and T2-weighted MRI scans. This approach can avoid the time-consuming process of tissue segmentation and maximize the advantages of position-dependent conductivity assignment based on water content values estimated from MRI intensity values. The computational results of the proposed approach provide similar but smoother electric field results for the brain when compared to conventional approaches.


Using Deep Learning and Machine Learning to Detect Epileptic Seizure with Electroencephalography (EEG) Data. (arXiv:1910.02544v1 [cs.LG])


Authors: Haotian Liu, Lin Xi, Ying Zhao, Zhixiang Li

The prediction of epileptic seizures has always been extremely challenging in the medical domain. However, with the development of computer technology, the application of machine learning has introduced new ideas for seizure forecasting. Applying machine learning models to the prediction of epileptic seizures could yield better results, and many scientists have been doing such work, so sufficient medical data are available for researchers to train machine learning models.


Rethinking Kernel Methods for Node Representation Learning on Graphs. (arXiv:1910.02548v1 [cs.LG])


Authors: Yu Tian, Long Zhao, Xi Peng, Dimitris N. Metaxas

Graph kernels are kernel methods measuring graph similarity and serve as a standard tool for graph classification. However, the use of kernel methods for node classification, which is a related problem to graph representation learning, is still ill-posed and the state-of-the-art methods are heavily based on heuristics. Here, we present a novel theoretical kernel-based framework for node classification that can bridge the gap between these two representation learning problems on graphs. Our approach is motivated by graph kernel methodology but extended to learn the node representations capturing the structural information in a graph. We theoretically show that our formulation is as powerful as any positive semidefinite kernels. To efficiently learn the kernel, we propose a novel mechanism for node feature aggregation and a data-driven similarity metric employed during the training phase. More importantly, our framework is flexible and complementary to other graph-based deep learning models, e.g., Graph Convolutional Networks (GCNs). We empirically evaluate our approach on a number of standard node classification benchmarks, and demonstrate that our model sets the new state of the art.


ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation. (arXiv:1910.02550v1 [cs.CV])


Authors: Shreeyak S. Sajjan (1), Matthew Moore (1), Mike Pan (1), Ganesh Nagaraja (1), Johnny Lee (2), Andy Zeng (2), Shuran Song (2 and 3) ((1) (2) Google (3) Columbia University)

Transparent objects are a common part of everyday life, yet they possess unique visual properties that make them incredibly difficult for standard 3D sensors to produce accurate depth estimates for. In many cases, they often appear as noisy or distorted approximations of the surfaces that lie behind them. To address these challenges, we present ClearGrasp -- a deep learning approach for estimating accurate 3D geometry of transparent objects from a single RGB-D image for robotic manipulation. Given a single RGB-D image of transparent objects, ClearGrasp uses deep convolutional networks to infer surface normals, masks of transparent surfaces, and occlusion boundaries. It then uses these outputs to refine the initial depth estimates for all transparent surfaces in the scene. To train and test ClearGrasp, we construct a large-scale synthetic dataset of over 50,000 RGB-D images, as well as a real-world test benchmark with 286 RGB-D images of transparent objects and their ground truth geometries. The experiments demonstrate that ClearGrasp is substantially better than monocular depth estimation baselines and is capable of generalizing to real-world images and novel objects. We also demonstrate that ClearGrasp can be applied out-of-the-box to improve grasping algorithms' performance on transparent objects. Code, data, and benchmarks will be released. Supplementary materials are available on the project website.


PyODDS: An End-to-End Outlier Detection System. (arXiv:1910.02575v1 [cs.LG])


Authors: Yuening Li, Daochen Zha, Na Zou, Xia Hu

PyODDS is an end-to-end Python system for outlier detection with database support. PyODDS provides outlier detection algorithms that meet the demands of users in different fields, with or without a data science or machine learning background. PyODDS gives the ability to execute machine learning algorithms in-database without moving data out of the database server or over the network. It also provides access to a wide range of outlier detection algorithms, including statistical analysis and more recent deep learning based approaches. PyODDS is released under the MIT open-source license, and its source code and official documentation are available online.


Unsupervised Image Super-Resolution with an Indirect Supervised Path. (arXiv:1910.02593v1 [eess.IV])


Authors: Zhen Han, Enyan Dai, Xu Jia, Shuaijun Chen, Chunjing Xu, Jianzhuang Liu, Qi Tian

The task of single image super-resolution (SISR) aims at reconstructing a high-resolution (HR) image from a low-resolution (LR) image. Although significant progress has been made by deep learning models, they are trained on synthetic paired data in a supervised way and do not perform well on real data. There are several attempts that directly apply unsupervised image translation models to address such a problem. However, unsupervised low-level vision problems pose a greater challenge to the accuracy of translation. In this work, we propose a novel framework which is composed of two stages: 1) unsupervised image translation between real LR images and synthetic LR images; 2) supervised super-resolution from approximated real LR images to HR images. It takes the synthetic LR images as a bridge and creates an indirect supervised path from real LR images to HR images. Any existing deep learning based image super-resolution model can be integrated into the second stage of the proposed framework for further improvement. In addition, it shows great flexibility in balancing between distortion and perceptual quality under the unsupervised setting. The proposed method is evaluated on both NTIRE 2017 and 2018 challenge datasets and achieves favorable performance against supervised methods.


Deep Hyperedges: a Framework for Transductive and Inductive Learning on Hypergraphs. (arXiv:1910.02633v1 [cs.LG])


Authors: Josh Payne

From social networks to protein complexes to disease genomes to visual data, hypergraphs are everywhere. However, the scope of research studying deep learning on hypergraphs is still quite sparse and nascent, as there has not yet existed an effective, unified framework for using hyperedge and vertex embeddings jointly in the hypergraph context, despite a large body of prior work that has shown the utility of deep learning over graphs and sets. Building upon these recent advances, we propose Deep Hyperedges (DHE), a modular framework that jointly uses contextual and permutation-invariant vertex membership properties of hyperedges in hypergraphs to perform classification and regression in transductive and inductive learning settings. In our experiments, we use a novel random walk procedure and show that our model achieves and, in most cases, surpasses state-of-the-art performance on benchmark datasets. Additionally, we study our framework's performance on a variety of diverse, non-standard hypergraph datasets and propose several avenues of future work to further enhance DHE.


Deep Kernel Learning via Random Fourier Features. (arXiv:1910.02660v1 [cs.LG])


Authors: Jiaxuan Xie, Fanghui Liu, Kaijie Wang, Xiaolin Huang

Kernel learning methods are among the most effective learning methods and have been vigorously studied in the past decades. However, when tackling complicated tasks, classical kernel methods are not flexible or "rich" enough to describe the data and hence cannot yield satisfactory performance. In this paper, via Random Fourier Features (RFF), we successfully incorporate the deep architecture into kernel learning, which significantly boosts the flexibility and richness of kernel machines while keeping the kernels' advantage of pairwise handling of small data. With RFF, we can establish a deep structure in which every kernel in the RFF layers can be trained end-to-end. Since RFF with different distributions could represent different kernels, our model has the capability of finding suitable kernels for each layer, which is much more flexible than traditional kernel-based methods where the kernel is pre-selected. This fact also helps yield a more sophisticated kernel cascade connection in the architecture. On small datasets (less than 1000 samples), for which deep learning is generally not suitable due to overfitting, our method achieves superior performance compared to advanced kernel methods. On large-scale datasets, including non-image and image classification tasks, our method also has competitive performance.
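For readers unfamiliar with Random Fourier Features, the following is a minimal sketch of the standard single-layer construction for an RBF kernel, not the deep, trainable architecture the abstract describes. The function names, the `gamma` parameterization, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Map X (n_samples, d) to random Fourier features whose inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are sampled from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases make a single cosine per frequency sufficient.
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma):
    """Exact RBF Gram matrix, used here only to check the approximation."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))            # toy data, purely illustrative
Z = rff_features(X, n_features=5000, gamma=0.5, rng=rng)
K_approx = Z @ Z.T                     # approximate Gram matrix
K_exact = rbf_kernel(X, gamma=0.5)
max_err = np.max(np.abs(K_approx - K_exact))
print(f"max absolute error: {max_err:.4f}")
```

Sampling the frequencies from a different distribution yields a different shift-invariant kernel, which is the property the abstract relies on when it says each layer can find a suitable kernel.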


DeshadowGAN: A Deep Learning Approach to Remove Shadows from Optical Coherence Tomography Images. (arXiv:1910.02844v1 [eess.IV])


Authors: Haris Cheong, Sripad Krishna Devalla, Tan Hung Pham, Zhang Liang, Tin Aung Tun, Xiaofei Wang, Shamira Perera, Leopold Schmetterer, Aung Tin, Craig Boote, Alexandre H.Thiery, Michael J. A. Girard

Purpose: To remove retinal shadows from optical coherence tomography (OCT) images of the optic nerve head (ONH).

Methods: 2328 OCT images acquired through the center of the ONH using a Spectralis OCT machine for both eyes of 13 subjects were used to train a generative adversarial network (GAN) using a custom loss function. Image quality was assessed qualitatively (for artifacts) and quantitatively using the intralayer contrast: a measure of shadow visibility ranging from 0 (shadow-free) to 1 (strong shadow) and compared to compensated images. This was computed in the Retinal Nerve Fiber Layer (RNFL), the Inner Plexiform Layer (IPL), the Photoreceptor layer (PR) and the Retinal Pigment Epithelium (RPE) layers.

Results: Output images had improved intralayer contrast in all ONH tissue layers. On average the intralayer contrast decreased by 33.7±6.81%, 28.8±10.4%, 35.9±13.0%, and 43.0±19.5% for the RNFL, IPL, PR, and RPE layers respectively, indicating successful shadow removal across all depths. This compared to 70.3±22.7%, 33.9±11.5%, 47.0±11.2%, and 26.7±19.0% for compensation. Output images were also free from artifacts commonly observed with compensation.

Conclusions: DeshadowGAN significantly corrected blood vessel shadows in OCT images of the ONH. Our algorithm may be considered as a pre-processing step to improve the performance of a wide range of algorithms including those currently being used for OCT image segmentation, denoising, and classification.

Translational Relevance: DeshadowGAN could be integrated to existing OCT devices to improve the diagnosis and prognosis of ocular pathologies.


Deep Learning for mmWave Beam and Blockage Prediction Using Sub-6GHz Channels. (arXiv:1910.02900v1 [cs.IT])


Authors: Muhammad Alrabeiah, Ahmed Alkhateeb

Predicting the millimeter wave (mmWave) beams and blockages using sub-6GHz channels has the potential of enabling mobility and reliability in scalable mmWave systems. These gains attracted increasing interest in the last few years. Prior work, however, has focused on extracting spatial channel characteristics at the sub-6GHz band first and then using them to reduce the mmWave beam training overhead. This approach has a number of limitations: (i) It still requires a beam search at mmWave, (ii) its performance is sensitive to the error associated with extracting the sub-6GHz channel characteristics, and (iii) it does not normally account for the different dielectric properties at the different bands. In this paper, we first prove that under certain conditions, there exist mapping functions that can predict the optimal mmWave beam and correct blockage status directly from the sub-6GHz channel, which overcomes the limitations in prior work. These mapping functions, however, are hard to characterize analytically, which motivates exploiting deep neural network models to learn them. For that, we prove that a large enough neural network can use the sub-6GHz channel to directly predict the optimal mmWave beam and the correct blockage status with success probabilities that can be made arbitrarily close to one. Then, we develop an efficient deep learning model and empirically evaluate its beam/blockage prediction performance using the publicly available dataset DeepMIMO. The results show that the proposed solution can predict the mmWave blockages with more than 90% success probability. Further, these results confirm the capability of the proposed deep learning model in predicting the optimal mmWave beams and approaching the optimal data rates, that assume perfect channel knowledge, while requiring no beam training overhead...


Correlations between Word Vector Sets. (arXiv:1910.02902v1 [cs.CL])


Authors: Vitalii Zhelezniak, April Shen, Daniel Busbridge, Aleksandar Savkov, Nils Hammerla

Similarity measures based purely on word embeddings are comfortably competing with much more sophisticated deep learning and expert-engineered systems on unsupervised semantic textual similarity (STS) tasks. In contrast to commonly used geometric approaches, we treat a single word embedding as e.g. 300 observations from a scalar random variable. Using this paradigm, we first illustrate that similarities derived from elementary pooling operations and classic correlation coefficients yield excellent results on standard STS benchmarks, outperforming many recently proposed methods while being much faster and trivial to implement. Next, we demonstrate how to avoid pooling operations altogether and compare sets of word embeddings directly via correlation operators between reproducing kernel Hilbert spaces. Just like cosine similarity is used to compare individual word vectors, we introduce a novel application of the centered kernel alignment (CKA) as a natural generalisation of squared cosine similarity for sets of word vectors. Likewise, CKA is very easy to implement and enjoys very strong empirical results.
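As a rough illustration of centered kernel alignment, the sketch below computes CKA with a linear kernel between two sets of word vectors. It follows the abstract's framing of each word vector as a series of observations (rows are embedding dimensions, columns are words), but the function names, the choice of a linear kernel, and the random stand-in embeddings are assumptions; the authors' implementation may differ.

```python
import numpy as np

def center_gram(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H

def cka(K, L):
    """Centered kernel alignment between two Gram matrices of equal size."""
    Kc, Lc = center_gram(K), center_gram(L)
    # For symmetric matrices, the elementwise sum equals trace(Kc @ Lc).
    return np.sum(Kc * Lc) / np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))

def linear_cka(X, Y):
    """X: (d, n_words_x), Y: (d, n_words_y); rows are the d paired observations."""
    return cka(X @ X.T, Y @ Y.T)

rng = np.random.default_rng(0)
sent_a = rng.normal(size=(300, 4))   # hypothetical sentence: 4 words, 300-dim vectors
sent_b = rng.normal(size=(300, 6))   # hypothetical sentence: 6 words
score = linear_cka(sent_a, sent_b)
self_score = linear_cka(sent_a, sent_a)

# With one word per set, linear CKA reduces to the squared Pearson correlation,
# i.e. the squared cosine similarity of the mean-centered vectors.
u = rng.normal(size=(300, 1))
v = 0.5 * u + rng.normal(size=(300, 1))
single_score = linear_cka(u, v)
pearson_sq = np.corrcoef(u[:, 0], v[:, 0])[0, 1] ** 2
print(score, self_score, single_score, pearson_sq)
```

Because the Gram matrices are built over embedding dimensions, the two sets may contain different numbers of words, so no pooling step is needed to compare them.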


A Survey on Active Learning and Human-in-the-Loop Deep Learning for Medical Image Analysis. (arXiv:1910.02923v1 [cs.LG])


Authors: Samuel Budd, Emma C Robinson, Bernhard Kainz

Fully automatic deep learning has become the state-of-the-art technique for many tasks including image acquisition, analysis and interpretation, and for the extraction of clinically useful information for computer-aided detection, diagnosis, treatment planning, intervention and therapy. However, the unique challenges posed by medical image analysis suggest that retaining a human end-user in any deep learning enabled system will be beneficial. In this review we investigate the role that humans might play in the development and deployment of deep learning enabled diagnostic applications and focus on techniques that will retain a significant input from a human end user. Human-in-the-Loop computing is an area that we see as increasingly important in future research due to the safety-critical nature of working in the medical domain. We evaluate four key areas that we consider vital for deep learning in the clinical practice: (1) Active Learning - to choose the best data to annotate for optimal model performance; (2) Interpretation and Refinement - using iterative feedback to steer models to optima for a given prediction and offering meaningful ways to interpret and respond to predictions; (3) Practical considerations - developing full scale applications and the key considerations that need to be made before deployment; (4) Related Areas - research fields that will benefit human-in-the-loop computing as they evolve. We offer our opinions on the most promising directions of research and how various aspects of each area might be unified towards common goals.


Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks. (arXiv:1910.02934v1 [cs.LG])


Authors: Spencer Frei, Yuan Cao, Quanquan Gu

The skip-connections used in residual networks have become a standard architecture choice in deep learning due to the increased training stability and generalization performance with this architecture, although there has been limited theoretical understanding for this improvement. In this work, we analyze overparameterized deep residual networks trained by gradient descent following random initialization, and demonstrate that (i) the class of networks learned by gradient descent constitutes a small subset of the entire neural network function class, and (ii) this subclass of networks is sufficiently large to guarantee small training error. By showing (i) we are able to demonstrate that deep residual networks trained with gradient descent have a small generalization gap between training and test error, and together with (ii) this guarantees that the test error will be small. Our optimization and generalization guarantees require overparameterization that is only logarithmic in the depth of the network, while all known generalization bounds for deep non-residual networks have overparameterization requirements that are at least polynomial in the depth. This provides an explanation for why residual networks are preferable to non-residual ones.


Deep Learning with a Rethinking Structure for Multi-label Classification. (arXiv:1802.01697v2 [cs.LG] UPDATED)


Authors: Yao-Yuan Yang, Yi-An Lin, Hong-Min Chu, Hsuan-Tien Lin

Multi-label classification (MLC) is an important class of machine learning problems that come with a wide spectrum of applications, each demanding a possibly different evaluation criterion. When solving the MLC problems, we generally expect the learning algorithm to take the hidden correlation of the labels into account to improve the prediction performance. Extracting the hidden correlation is generally a challenging task. In this work, we propose a novel deep learning framework to better extract the hidden correlation with the help of the memory structure within recurrent neural networks. The memory stores the temporary guesses on the labels and effectively allows the framework to rethink about the goodness and correlation of the guesses before making the final prediction. Furthermore, the rethinking process makes it easy to adapt to different evaluation criteria to match real-world application needs. In particular, the framework can be trained in an end-to-end style with respect to any given MLC evaluation criteria. The end-to-end design can be seamlessly combined with other deep learning techniques to conquer challenging MLC problems like image tagging. Experimental results across many real-world data sets justify that the rethinking framework indeed improves MLC performance across different evaluation criteria and leads to superior performance over state-of-the-art MLC algorithms.


Task-specific Deep LDA pruning of neural networks. (arXiv:1803.08134v5 [cs.CV] UPDATED)


Authors: Qing Tian, Tal Arbel, James J. Clark

With deep learning's success, a limited number of popular deep nets have been widely adopted for various vision tasks. However, this usually results in unnecessarily high complexities and possibly many features of low task utility. In this paper, we address this problem by introducing a task-dependent deep pruning framework based on Fisher's Linear Discriminant Analysis (LDA). The approach can be applied to convolutional, fully-connected, and module-based deep network structures, in all cases leveraging the high decorrelation of neuron motifs found in the pre-decision layer and cross-layer deconv dependency. Moreover, we examine our approach's potential in network architecture search for specific tasks and analyze the influence of our pruning on model robustness to noises and adversarial attacks. Experimental results on datasets of generic objects, as well as domain specific tasks (CIFAR100, Adience, and LFWA) illustrate our framework's superior performance over state-of-the-art pruning approaches and fixed compact nets (e.g. SqueezeNet, MobileNet). The proposed method successfully maintains comparable accuracies even after discarding most parameters (98%-99% for VGG16, up to 82% for the already compact InceptionNet) and with significant FLOP reductions (83% for VGG16, up to 64% for InceptionNet). Through pruning, we can also derive smaller, but more accurate and more robust models suitable for the task.


Semi-Supervised Domain Adaptation with Representation Learning for Semantic Segmentation across Time. (arXiv:1805.04141v2 [cs.CV] UPDATED)


Authors: Assia Benbihi, Matthieu Geist, Cédric Pradalier

Deep learning generates state-of-the-art semantic segmentation provided that a large number of images together with pixel-wise annotations are available. To alleviate the expensive data collection process, we propose a semi-supervised domain adaptation method for the specific case of images with similar semantic content but different pixel distributions. A network trained with supervision on a past dataset is finetuned on the new dataset to conserve its feature maps. The domain adaptation becomes a simple regression between feature maps and does not require annotations on the new dataset. This method reaches performance similar to classic transfer learning on the PASCAL VOC dataset with synthetic transformations.


Learning-based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems. (arXiv:1810.08869v2 [cs.DC] UPDATED)


Authors: Biresh Kumar Joardar, Ryan Gary Kim, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, Radu Marculescu

The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient. Due to the amount of data parallelism in these algorithms, high-performance 3D manycore platforms that incorporate both CPUs and GPUs present a promising direction. However, as systems use heterogeneity (e.g., a combination of CPUs, GPUs, and accelerators) to improve performance and efficiency, it becomes more pertinent to address the distinct and likely conflicting communication requirements (e.g., CPU memory access latency or GPU network throughput) that arise from such heterogeneity. Unfortunately, it is difficult to quickly explore the hardware design space and choose appropriate tradeoffs between these heterogeneous requirements. To address these challenges, we propose the design of a 3D Network-on-Chip (NoC) for heterogeneous manycore platforms that considers the appropriate design objectives for a 3D heterogeneous system and explores various tradeoffs using an efficient ML-based multi-objective optimization technique. The proposed design space exploration considers the various requirements of its heterogeneous components and generates a set of 3D NoC architectures that efficiently trades off these design objectives. Our findings show that by jointly considering these requirements (latency, throughput, temperature, and energy), we can achieve 9.6% better Energy-Delay Product on average at nearly iso-temperature conditions when compared to a thermally-optimized design for 3D heterogeneous NoCs. More importantly, our results suggest that our 3D NoCs optimized for a few applications can be generalized for unknown applications as well. Our results show that these generalized 3D NoCs only incur a 1.8% (36-tile system) and 1.1% (64-tile system) average performance loss compared to application-specific NoCs.
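The idea of producing a set of designs that trade off several objectives can be sketched as a Pareto-dominance filter. This is only the generic non-dominated filtering step, not the ML-based optimization technique the abstract refers to; the objective tuples and their values are hypothetical, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the designs that no other design dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (latency, energy) scores for four candidate designs.
designs = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (11.0, 6.0)]
front = pareto_front(designs)
print(front)  # (11.0, 6.0) is dropped because (10.0, 5.0) dominates it
```

This brute-force filter is quadratic in the number of candidates, which is fine for a small illustration but is one reason large design spaces call for a guided search.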


Advancing PICO Element Detection in Biomedical Text via Deep Neural Networks. (arXiv:1810.12780v3 [cs.CL] UPDATED)


Authors: Di Jin, Peter Szolovits

In evidence-based medicine (EBM), defining a clinical question in terms of the specific patient problem aids the physicians to efficiently identify appropriate resources and search for the best available evidence for medical treatment. In order to formulate a well-defined, focused clinical question, a framework called PICO is widely used, which identifies the sentences in a given medical text that belong to the four components typically reported in clinical trials: Participants/Problem (P), Intervention (I), Comparison (C) and Outcome (O). In this work, we propose a novel deep learning model for recognizing PICO elements in biomedical abstracts. Based on the previous state-of-the-art bidirectional long-short term memory (biLSTM) plus conditional random field (CRF) architecture, we add another layer of biLSTM upon the sentence representation vectors so that the contextual information from surrounding sentences can be gathered to help infer the interpretation of the current one. In addition, we propose two methods to further generalize and improve the model: adversarial training and unsupervised pre-training over large corpora. We tested our proposed approach over two benchmark datasets. One is the PubMed-PICO dataset, where our best results outperform the previous best by 5.5%, 7.9%, and 5.8% for P, I, and O elements in terms of F1 score, respectively. And for the other dataset named NICTA-PIBOSO, the improvements for P/I/O elements are 2.4%, 13.6%, and 1.0% in F1 score, respectively. Overall, our proposed deep learning model can obtain unprecedented PICO element detection accuracy while avoiding the need for any manual feature selection.


Deep Learning for Robotic Mass Transport Cloaking. (arXiv:1812.04157v2 [cs.RO] UPDATED)


Authors: Reza Khodayi-mehr, Michael M. Zavlanos

We consider the problem of Mass Transport Cloaking using mobile robots. The robots move along a predefined curve that encloses the safe zone and carry sources that collectively counteract a chemical agent released in the environment. The goal is to steer the mass flux around a desired region so that it remains unaffected by the external concentration. We formulate the problem of controlling the robot positions and release rates as a PDE-constrained optimization, where the propagation of the chemical is modeled by the Advection-Diffusion (AD) PDE. We use a Deep Neural Network (NN) to approximate the solution of the PDE. Particularly, we propose a novel loss function for the NN that utilizes the variational form of the AD-PDE and allows us to reformulate the planning problem as an unsupervised model-based learning problem. Our loss function is discretization-free and highly parallelizable. Unlike passive cloaking methods that use metamaterials to steer the mass flux, our method is the first to use mobile robots to actively control the concentration levels and create safe zones independent of environmental conditions. We demonstrate the performance of our method in simulations.


MFQE 2.0: A New Approach for Multi-frame Quality Enhancement on Compressed Video. (arXiv:1902.09707v3 [cs.CV] UPDATED)


Authors: Zhenyu Guan, Qunliang Xing, Mai Xu, Ren Yang, Tie Liu, Zulin Wang

The past few years have witnessed great success in applying deep learning to enhance the quality of compressed image/video. The existing approaches mainly focus on enhancing the quality of a single frame, not considering the similarity between consecutive frames. Since heavy fluctuation exists across compressed video frames as investigated in this paper, frame similarity can be utilized for quality enhancement of low-quality frames given their neighboring high-quality frames. This task is Multi-Frame Quality Enhancement (MFQE). Accordingly, this paper proposes an MFQE approach for compressed video, as the first attempt in this direction. In our approach, we firstly develop a Bidirectional Long Short-Term Memory (BiLSTM) based detector to locate Peak Quality Frames (PQFs) in compressed video. Then, a novel Multi-Frame Convolutional Neural Network (MF-CNN) is designed to enhance the quality of compressed video, in which the non-PQF and its nearest two PQFs are the input. In MF-CNN, motion between the non-PQF and PQFs is compensated by a motion compensation subnet. Subsequently, a quality enhancement subnet fuses the non-PQF and compensated PQFs, and then reduces the compression artifacts of the non-PQF. Also, PQF quality is enhanced in the same way. Finally, experiments validate the effectiveness and generalization ability of our MFQE approach in advancing the state-of-the-art quality enhancement of compressed video. The code of our MFQE approach is available at


RAPID: Early Classification of Explosive Transients using Deep Learning. (arXiv:1904.00014v2 [astro-ph.IM] UPDATED)


Authors: Daniel Muthukrishna, Gautham Narayan, Kaisey S. Mandel, Rahul Biswas, Renée Hložek

We present RAPID (Real-time Automated Photometric IDentification), a novel time-series classification tool capable of automatically identifying transients from within a day of the initial alert, to the full lifetime of a light curve. Using a deep recurrent neural network with Gated Recurrent Units (GRUs), we present the first method specifically designed to provide early classifications of astronomical time-series data, typing 12 different transient classes. Our classifier can process light curves with any phase coverage, and it does not rely on deriving computationally expensive features from the data, making RAPID well-suited for processing the millions of alerts that ongoing and upcoming wide-field surveys such as the Zwicky Transient Facility (ZTF), and the Large Synoptic Survey Telescope (LSST) will produce. The classification accuracy improves over the lifetime of the transient as more photometric data becomes available, and across the 12 transient classes, we obtain an average area under the receiver operating characteristic curve of 0.95 and 0.98 at early and late epochs, respectively. We demonstrate RAPID's ability to effectively provide early classifications of observed transients from the ZTF data stream. We have made RAPID available as an open-source software package for machine learning-based alert-brokers to use for the autonomous and quick classification of several thousand light curves within a few seconds.


Is Word Segmentation Necessary for Deep Learning of Chinese Representations?. (arXiv:1905.05526v2 [cs.CL] UPDATED)


Authors: Xiaoya Li, Yuxian Meng, Xiaofei Sun, Qinghong Han, Arianna Yuan, Jiwei Li

Segmenting a chunk of text into words is usually the first step of processing Chinese text, but its necessity has rarely been explored. In this paper, we ask the fundamental question of whether Chinese word segmentation (CWS) is necessary for deep learning-based Chinese Natural Language Processing. We benchmark neural word-based models which rely on word segmentation against neural char-based models which do not involve word segmentation in four end-to-end NLP benchmark tasks: language modeling, machine translation, sentence matching/paraphrase and text classification. Through direct comparisons between these two types of models, we find that char-based models consistently outperform word-based models. Based on these observations, we conduct comprehensive experiments to study why word-based models underperform char-based models in these deep learning-based NLP tasks. We show that it is because word-based models are more vulnerable to data sparsity and the presence of out-of-vocabulary (OOV) words, and thus more prone to overfitting. We hope this paper could encourage researchers in the community to rethink the necessity of word segmentation in deep learning-based Chinese Natural Language Processing. (Yuxian Meng and Xiaoya Li contributed equally to this paper.)


From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning. (arXiv:1906.00579v2 [cs.CL] UPDATED)


Authors: Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Previously, a machine speech chain, which is based on sequence-to-sequence deep learning, was proposed to mimic speech perception and production behavior. Such chains separately processed listening and speaking by automatic speech recognition (ASR) and text-to-speech synthesis (TTS) and simultaneously enabled them to teach each other in semi-supervised learning when they received unpaired data. Unfortunately, this speech chain study is limited to speech and textual modalities. In fact, natural communication is multimodal and involves both auditory and visual sensory systems. Although the said speech chain reduces the requirement of having a full amount of paired data, in this case we still need a large amount of unpaired data. In this research, we take a further step and construct a multimodal chain and design a closely knit chain architecture that combines ASR, TTS, image captioning, and image production models into a single framework. The framework allows the training of each component without requiring a large amount of parallel multimodal data. Our experimental results also show that an ASR can be further trained without speech and text data and cross-modal data augmentation remains possible through our proposed chain, which improves the ASR performance.


Towards Automated Infographic Design: Deep Learning-based Auto-Extraction of Extensible Timeline. (arXiv:1907.13550v2 [cs.HC] UPDATED)


Authors: Zhutian Chen, Yun Wang, Qianwen Wang, Yong Wang, Huamin Qu

Designers need to consider not only perceptual effectiveness but also visual styles when creating an infographic. This process can be difficult and time consuming for professional designers, not to mention non-expert users, leading to the demand for automated infographics design. As a first step, we focus on timeline infographics, which have been widely used for centuries. We contribute an end-to-end approach that automatically extracts an extensible timeline template from a bitmap image. Our approach adopts a deconstruction and reconstruction paradigm. At the deconstruction stage, we propose a multi-task deep neural network that simultaneously parses two kinds of information from a bitmap timeline: 1) the global information, i.e., the representation, scale, layout, and orientation of the timeline, and 2) the local information, i.e., the location, category, and pixels of each visual element on the timeline. At the reconstruction stage, we propose a pipeline with three techniques, i.e., Non-Maximum Merging, Redundancy Recover, and DL GrabCut, to extract an extensible template from the infographic, by utilizing the deconstruction results. To evaluate the effectiveness of our approach, we synthesize a timeline dataset (4296 images) and collect a real-world timeline dataset (393 images) from the Internet. We first report quantitative evaluation results of our approach over the two datasets. Then, we present examples of automatically extracted templates and timelines automatically generated based on these templates to qualitatively demonstrate the performance. The results confirm that our approach can effectively extract extensible templates from real-world timeline infographics.


Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets. (arXiv:1909.05630v2 [cs.LG] UPDATED)


Authors: Walid Abdullah Al, Il Dong Yun

With the advent of deep learning, improved image classification with complex discriminative models has been made possible. However, such deep models with increased complexity require a huge set of labeled samples to generalize the training. Such classification models can easily overfit when applied to medical images because of limited training data, which is a common problem in the field of medical image analysis. This paper proposes and investigates a reinforced classifier for improving generalization when little training data is available. Partially following the idea of reinforcement learning, the proposed classifier uses generalization feedback from a subset of the training data to update its parameters instead of only using the conventional cross-entropy loss on the training data. We evaluate the improvement of the proposed classifier by applying it to three different classification problems against the standard deep classifiers equipped with existing overfitting-prevention techniques. Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification tasks.


Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning. (arXiv:1909.10300v2 [math.OC] UPDATED)


Authors: Jérôme Bolte, Edouard Pauwels

Modern problems in AI or in numerical analysis require nonsmooth approaches with a flexible calculus. We introduce generalized derivatives called conservative fields for which we develop a calculus and provide representation formulas. Functions having a conservative field are called path differentiable: convex, concave, Clarke regular and any semialgebraic Lipschitz continuous functions are path differentiable. Using Whitney stratification techniques for semialgebraic and definable sets, our model provides variational formulas for nonsmooth automatic differentiation oracles, as for instance the famous backpropagation algorithm in deep learning. Our differential model is applied to establish the convergence in values of nonsmooth stochastic gradient methods as they are implemented in practice.
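A small sketch may help show why nonsmooth automatic differentiation needs a calculus of its own. It is an illustration under stated assumptions, not the authors' construction: forward-mode dual numbers stand in for backpropagation, the convention relu'(0) = 0 is assumed, and the names `Dual`, `relu2`, `relu3` and `derivative` are introduced here for the example. The three functions below are equal everywhere, yet the chain rule applied to their programs returns three different values at 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with the derivative propagated by the chain rule."""
    val: float
    der: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def relu(x: Dual) -> Dual:
    # Assumed convention: the derivative of relu at exactly 0 is taken to be 0.
    return x if x.val > 0 else Dual(0.0, 0.0)


def relu2(x: Dual) -> Dual:
    # Same function as relu, written as relu(-x) + x.
    return relu(-x) + x


def relu3(x: Dual) -> Dual:
    # Same function again, written as the average of the two programs above.
    return 0.5 * (relu(x) + relu2(x))


def derivative(f, t: float) -> float:
    """Derivative of f at t as computed by forward-mode autodiff."""
    return f(Dual(t, 1.0)).der


for name, f in [("relu", relu), ("relu2", relu2), ("relu3", relu3)]:
    print(name, "value at 0:", f(Dual(0.0, 1.0)).val,
          "autodiff derivative at 0:", derivative(f, 0.0))
```

Away from 0 the three programs agree on the derivative; at 0 they return 0, 1 and 0.5, all of which lie in the interval [0, 1] of generalized slopes of relu at that point. The output of automatic differentiation thus depends on how a function is written, not only on the function itself.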


Historical and Modern Features for Buddha Statue Classification. (arXiv:1909.12921v2 [cs.CV] UPDATED)


Authors: Benjamin Renoust, Matheus Oliveira Franca, Jacob Chan, Noa Garcia, Van Le, Ayaka Uesaka, Yuta Nakashima, Hajime Nagahara, Jueren Wang, Yutaka Fujioka

While Buddhism has spread along the Silk Roads, many pieces of art have been displaced. Only a few experts can identify these works, and they do so subjectively, based on their experience. The construction of Buddha statues was taught through the definition of canon rules, but the application of those rules varies greatly across time and space. Automatic art analysis aims to address these challenges. We propose to automatically recover the proportions induced by the construction guidelines, in order to use them and to compare different deep learning features for several classification tasks, in a medium-sized but rich dataset of Buddha statues, collected with experts in Buddhist art history.


End-to-End Deep Convolutional Active Contours for Image Segmentation. (arXiv:1909.13359v2 [cs.CV] UPDATED)


Authors: Ali Hatamizadeh, Debleena Sengupta, Demetri Terzopoulos

The Active Contour Model (ACM) is a standard image analysis technique whose numerous variants have attracted an enormous amount of research attention across multiple fields. Incorrectly, however, the ACM's differential-equation-based formulation and prototypical dependence on user initialization have been regarded as being largely incompatible with the recently popular deep learning approaches to image segmentation. This paper introduces the first tight unification of these two paradigms. In particular, we devise Deep Convolutional Active Contours (DCAC), a truly end-to-end trainable image segmentation framework comprising a Convolutional Neural Network (CNN) and an ACM with learnable parameters. The ACM's Eulerian energy functional includes per-pixel parameter maps predicted by the backbone CNN, which also initializes the ACM. Importantly, both the CNN and ACM components are fully implemented in TensorFlow, and the entire DCAC architecture is end-to-end automatically differentiable and backpropagation trainable without user intervention. As a challenging test case, we tackle the problem of building instance segmentation in aerial images and evaluate DCAC on two publicly available datasets, Vaihingen and Bing Huts. Our results demonstrate that, for building segmentation, the DCAC establishes a new state-of-the-art performance by a wide margin.


Knowledge Mining with Azure Search | AI Show


In this episode, Luis stopped by and showed how much more can be done with Cognitive Search (with recipes to boot). Extracting structure from unstructured data is a powerful addition to Cognitive Search! The demo he presents gives a step-by-step process for using Cognitive Search to enrich your index.

Main Demo: [04:56]




Taboola and Outbrain to Merge to Create Meaningful Advertising Competitor to Facebook and Google


NEW YORK: Taboola and Outbrain, two digital advertising platforms, today announced that they have entered into an agreement to merge, subject to customary closing conditions. Both companies’ Boards of Directors have approved the transaction. The combined company will provide enhanced advertising efficacy and reach to marketers worldwide, while helping news organizations and other digital properties more effectively find growth in the years to come.

“Over the past decade, I’ve admired Outbrain and the innovation that Yaron Galai, Ori Lahav and the rest of the Outbrain team have brought to the marketplace. By joining forces, we’ll be able to create a more robust competitor to Facebook and Google, giving advertisers a more meaningful choice,” said Adam Singolda, Founder & CEO of Taboola. According to eMarketer, almost 70% of total U.S. digital advertising revenue in 2019 is controlled by only three companies: Google, Facebook and Amazon. “We’re passionate about driving growth for our customers and supporting the open web, which we consider critical in a world where walled gardens are strong, and perhaps too strong. Working together, we will continue investing to better connect advertising dollars with local and national news organizations, strengthening journalism over the next decade. This is why we’re merging; this is our mission.”

“We are excited to partner with Taboola. Both Outbrain and Taboola have a shared mission and vision of supporting quality journalism globally and delivering meaningful value to the open web marketplace,” said Yaron Galai, co-Founder and co-CEO of Outbrain. “Ori and I had a vision of helping people discover quality content online, and we see a tremendous opportunity in joining forces in order to bring the next wave of innovation to our publisher partners and advertisers. I’m confident that together, we will be able to further our mission, which we call our Lighthouse, of bringing the best, most trustworthy content discovery capabilities to users around the world.”

Upon closing, Adam Singolda, the Founder and current CEO of Taboola, will assume the CEO position of the combined company, which will operate under the Taboola brand name, with branding to be determined and to reflect the merger of the two companies. Under the terms of the merger agreement, Outbrain shareholders will receive shares representing 30% of the combined company plus $250 million of cash. The Board of Directors of the combined company will consist of current Taboola and Outbrain Management and Board members. Eldad Maniv, President & COO of Taboola, and David Kostman, co-CEO of Outbrain, will work closely together on managing all aspects of the post-merger integration. Yaron Galai will remain committed to the success of the combined company, and actively assist with the transition for the 12 months following the closing.

“We are fortunate to have great talent at both Outbrain and Taboola,” said Eldad Maniv. “As soon as the merger closes, we will work to integrate teams, technologies and infrastructures so we can quickly accelerate growth across all dimensions. We have set aggressive goals for bringing value to our customers, driving technology innovation and delivering financial results to our shareholders through increased efficacy and innovation. By working with David and the Outbrain team, I’m confident we can achieve them.”

“For over 10 years, each company has built incredibly powerful solutions that have helped tens of thousands of publishers and advertisers thrive,” said David Kostman. “I look forward to working together with Eldad and his team to bring together the best of each company’s technology, product and business expertise to build a compelling global open web alternative to Google and Facebook.”

The combined company will have over 2,000 employees across 23 offices, serving over 20,000 clients in more than 50 countries across North America, Latin America, Europe, the Middle East and Asia-Pacific.

Compelling Strategic and Financial Rationale for the Merger

Key strategic benefits of the merger include:

1. Increased Advertiser Choice: The combined company will be able to provide advertisers, from small businesses to global brands, with a meaningful competitive alternative to Google and Facebook, the companies currently known as the “Duopoly” that command the vast majority of digital ad spend.

2. Greater Advertising Efficiency: A unified and consolidated buying platform will provide advertisers with greater efficiencies, helping them reach their awareness, consideration and conversion goals.

3. Higher Revenue and User Engagement to Publishers, Mobile Carriers and Mobile OEMs: Through increased investment in technology and expanded reach, the combined platform will be able to increase revenue to publishers, mobile carriers and device manufacturers, and drive better user engagement.

4. Accelerated Innovation: By combining two of the strongest data science and AI teams in the industry, and by accelerating investment in R&D, the company will be able to better address the evolving needs of its partners and customers.

5. Better Consumer Experience: Increasingly, Taboola’s and Outbrain’s solutions are embraced directly by consumers to help them discover what’s interesting and new, at moments when they’re ready to explore. For example, Taboola News is now embedded in over 60 million Android devices worldwide. The combined company will be able to accelerate the development of such innovative solutions, improving people’s ability to enjoy quality journalism.

Representation

J.P. Morgan Securities LLC acted as a financial advisor to Taboola. Goldman, Sachs & Co. acted as a financial advisor to Outbrain. Meitar Liquornik Geva Leshem Tal Law Offices and Davis Polk & Wardwell LLP acted as legal counsel to Taboola, and Meitar Liquornik Geva Leshem Tal Law Offices, White & Case LLP and Wilson Sonsini Goodrich & Rosati acted as legal counsel to Outbrain.

About Taboola

Taboola helps people discover what’s interesting and new. The company’s platform and suite of products, powered by deep learning and the largest dataset of content consumption patterns on the open web, is used by over 20,000 companies to reach over 1.4 billion people each month. Advertisers use Taboola to reach their target audience when they’re most receptive to new messages, products and services. Digital properties, including publishers, mobile carriers and handset manufacturers, use Taboola to drive audience monetization and engagement. Some of the most innovative digital properties in the world have strong relationships with Taboola, including CNBC, NBC News, USA TODAY, BILD, Sankei, Huffington Post, Microsoft, Business Insider, The Independent, El Mundo, and Le Figaro. The company is headquartered in New York City with offices in 15 cities worldwide.





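Frank Soqui, Intel vice president and general manager of the Desktop, Workstation and Channel Group, said in an interview that the 14-nanometer processors will put new levels of computing performance and AI acceleration into the hands of professional creators and PC enthusiasts.

Intel's new Xeon W-2200 and Core X-series processors target professional creators and enthusiasts who need high-end desktop PCs and mainstream workstations. The chips provide AI acceleration through integrated Intel Deep Learning Boost, a new instruction capability that supports AI inference 2.2 times faster than the previous generation. The Intel Xeon W-2200 platform includes eight new processors: W-2295, W-2275, W-2265, W-2255, W-2245, W-2235, W-2225 and W-2223. They are aimed at data science, visual effects, 3D rendering, complex 3D CAD, AI development and edge deployments.

The Intel Core X-series processors give enthusiasts additional overclocking flexibility for higher performance. The four new processors (i9-10980XE, i9-10940X, i9-10920X and i9-10900X) are particularly well suited to advanced workflows with varying demands, such as photo/video editing, game development and 3D animation. They also offer the enhancements enthusiasts look for, such as Intel Performance Maximizer, which lets users dynamically and reliably custom-tune their unlocked processors based on the individual performance DNA of each Core X-series chip.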





© Googlier LLC, 2019