
          What's new on arXiv
On Meta-Learning for Dynamic Ensemble Selection In this paper, we propose a novel dynamic ensemble selection framework using meta-learning. The …



          What's new on arXiv
ADEPOS: Anomaly Detection based Power Saving for Predictive Maintenance using Edge Computing In industry 4.0, predictive maintenance(PM) is one of …



          Harvard University astrophysicists: "the space object 'Oumuamua was an alien spacecraft"
Last year, the scientific community marveled at the appearance of a strange, elongated and unexpected rock that had turned up in our solar system, 'Oumuamua. At the time, scientists could not determine what it was, whether a comet or an asteroid, or whether it was a damaged alien spacecraft, as some researchers said.

The mystery has persisted until now, however. Now a new scientific study by two astrophysicists at the prestigious Harvard University, Dr. Shmuel Bialy and Dr. Abraham Loeb, offers a surprising hypothesis: 'Oumuamua could have been a mysterious artifact, a piece of real technology belonging to an interstellar civilization. This is no longer mere unfounded speculation; it is supported by a scientific study conducted by two respected scientists. The authors of the study started from a simple idea: what if solar radiation pressure caused the unexpected acceleration of 'Oumuamua? That seems quite reasonable.
But for the Sun's radiation to cause the acceleration that scientists observed, 'Oumuamua would need to have a very strange shape. Paul Gilster, a blogger who writes about peer-reviewed astronomical research, explained: "We can work out constraints on the object's area through its observed magnitude. The paper goes on to show that a thin sheet about 0.3 mm thick and about 20 meters in radius would allow the non-gravitational acceleration calculated in the Micheli paper. Therefore, considering the object as a thin surface, we can imagine a hollow cylindrical or conical shape. You can easily imagine turning a curved piece of paper and looking at the net surface from different viewing angles."

Under this hypothesis, the scientists wrote in their study: "'Oumuamua is a probe intended for a reconnaissance mission and not a member of a random population of stellar objects (asteroids, comets, etc.)." Yes! Incredible as it may seem, that is what the astrophysicists wrote. There are, of course, many reasons for skepticism. For one thing, there are plausible alternative explanations for the acceleration of 'Oumuamua other than solar radiation pressure. Davide Farnocchia of NASA's Jet Propulsion Laboratory put forward this hypothesis: "This additional subtle force on 'Oumuamua is likely caused by jets of gaseous material ejected from its surface. This same kind of outgassing affects the motion of many comets in our solar system."
Even that view was not entirely satisfactory, however. As it passed through our solar system, 'Oumuamua showed no sign of a comet-like tail, which would probably accompany an object accelerating because of gas jets.

Frustratingly, it seems we will never have a definitive answer about what 'Oumuamua was. It has left our solar system and is now too far away to see. Scientists were surprised when they saw that 'Oumuamua had accelerated when it was near the Sun, practically cancelling out the Sun's powerful gravitational pull. Even when it was close, our radar technology and telescopic observation could only capture blurry images of the object. So we are still mostly in the dark. But if we could eventually confirm that an alien object visited our solar system, we would have an answer to the famous paradox of the physicist Enrico Fermi.
Given the assumption that humans are unlikely to be a one-off event in the universe, and given that eons have passed since life became possible in the universe, why have we found no sign of extraterrestrial life? Perhaps we already have and simply did not realize it at the time. The scientific study was published on arXiv.org.
Another post by: UFOS ONLINE



          Italian physicists came up with an equation for “perfect pizza”

In Good Will Hunting, Matt Damon plays an MIT janitor who solves a nearly unsolvable math problem that gets him the opportunity to do math every single day and see a psychologist. Yay? Think how motivated he would have been if solving the equation would have resulted in the world’s most scientifically perfect pizza instead.

In a paper titled “The Physics of Baking Good Pizza,” published earlier this year on the preprint server arXiv and unearthed by LiveScience, physicists Andrey Varlamov of the Institute of Superconductors, Oxides and Other Innovative Materials and Devices and Andreas Glatz of Northern Illinois University teamed up with food anthropologist Sergio Grasso to get to the bottom of the science behind the perfect pizza. Since today is election day in the U.S., and many of us will be eating our feelings later, we figured it was worth a look.

Turns out the secret to a perfect pizza is all about the thermodynamics of the wood-fired brick ovens used to crisp a pizza crust to golden perfection. The wood fire burns in one corner, radiating heat up the curved walls and stone floor of the oven to produce the perfect conditions for an evenly baked pie. Under ideal thermodynamic conditions, you could be eating a Margherita pizza in precisely two minutes in a brick oven heated to 625 degrees F (330 degrees C).

If you don’t have a brick pizza oven at home because your landlord has no vision, the authors devised a lengthy thermodynamic equation (as well as the first true justification for doing your math homework) that explains the physics of creating the perfect pie in the electric oven sitting in your kitchen right now. The secret? Turning the heat down to 450 degrees F (230 degrees C) for 170 seconds.

Crack the equation in their paper . . . or have Matt Damon do it for you.


           Comment on Oumuamua by Zeinish
It is definitely doable. https://arxiv.org/abs/1711.03155

<blockquote> It is demonstrated that based on currently existing technologies such as from the Parker Solar Probe, launchers such as the Falcon Heavy and Space Launch System could send spacecraft with masses ranging from dozens to hundreds of kilograms to 1I/’Oumuamua, if launched in 2021. A further increase in spacecraft mass can be achieved with an additional Saturn flyby post solar Oberth maneuver. </blockquote>

If you are too lazy to read the paper:

Short way, using Jupiter flyby only: Launch April 30 2021, arrival April 23 2029. Spacecraft mass: SLS 122kg, Falcon Heavy 37kg

Long way for patient people, using Jupiter and Saturn flyby: Launch April 30 2021, arrival September 7 2049. Spacecraft mass: SLS 1329kg, Falcon Heavy 399kg

SLS is not going to happen in 2021, or any other year, so it leaves Falcon Heavy. So, everyone must put pressure on Elon Musk
          [TobagoJack] ‘We’ cannot stop the progression of totally screw up the environment, I fear, fo...
‘We’ cannot stop the progression of totally screw up the environment, I fear, for ‘we’ cannot even stop wars.

For as long as we have amongst us enough people who can utter ‘history does not matter’ and ‘say ‘no’ to hospitals’, we are more likely lost, in direct proportion to number of such enough people.

I am guessing that the world needs more education, starting at the highest levels, all embracing, in varied subjects, all encompassing, and somehow be inclusive of all, even those that live in remote African villages and concrete HK jungles, and and and, in order to save the day

In any case, am of two minds re below

edition-m.cnn.com

Interstellar object may have been alien probe, Harvard paper argues, but experts are skeptical

(CNN) — A mysterious cigar-shaped object spotted tumbling through our solar system last year may have been an alien spacecraft sent to investigate Earth, astronomers from Harvard University have suggested.

The object, nicknamed 'Oumuamua, meaning "a messenger that reaches out from the distant past" in Hawaiian, was discovered in October 2017 by the Pan-STARRS 1 telescope in Hawaii.

Since its discovery, scientists have been at odds to explain its unusual features and precise origins, with researchers first calling it a comet and then an asteroid before finally deeming it the first of its kind: a new class of "interstellar objects."

A new paper by researchers at the Harvard Smithsonian Center for Astrophysics raises the possibility that the elongated dark-red object, which is 10 times as long as it is wide and traveling at speeds of 196,000 mph, might have an "artificial origin."
"'Oumuamua may be a fully operational probe sent intentionally to Earth vicinity by an alien civilization," they wrote in the paper, which has been submitted to the Astrophysical Journal Letters.
The theory is based on the object's "excess acceleration," or its unexpected boost in speed as it traveled through and ultimately out of our solar system in January.

"Considering an artificial origin, one possibility is that 'Oumuamua is a light sail, floating in interstellar space as a debris from an advanced technological equipment," wrote the paper's authors, suggesting that the object could be propelled by solar radiation.

The paper was written by Abraham Loeb, professor and chair of astronomy, and Shmuel Bialy, a postdoctoral scholar, at the Harvard Smithsonian Center for Astrophysics. Loeb has published four books and more than 700 papers on topics like black holes, the future of the universe, the search for extraterrestrial life and the first stars.

The paper points out that comparable light-sails exist on Earth.

"Light-sails with similar dimensions have been designed and constructed by our own civilization, including the IKAROS project and the Starshot Initiative. The light-sail technology might be abundantly used for transportation of cargos between planets or between stars."

In the paper, the pair theorize that the object's high speed and its unusual trajectory could be the result of it no longer being operational.

"This would account for the various anomalies of 'Oumuamua, such as the unusual geometry inferred from its light-curve, its low thermal emission, suggesting high reflectivity, and its deviation from a Keplerian orbit without any sign of a cometary tail or spin-up torques."

'Oumuamua is the first object ever seen in our solar system that is known to have originated elsewhere.

At first, astronomers thought the rapidly moving faint light was a regular comet or an asteroid that had originated in our solar system.

Comets, in particular, are known to speed up due to "outgassing," a process in which the sun heats the surface of the icy comet, releasing melted gas. But 'Oumuamua didn't have a "coma," the atmosphere and dust that surrounds comets as they melt.

Multiple telescopes focused on the object for three nights to determine what it was before it moved out of sight.

Going forward, the researchers believe we should search for other interstellar objects in our sky.

"It is exciting to live at a time when we have the scientific technology to search for evidence of alien civilizations," Loeb wrote in an email. "The evidence about `Oumuamua is not conclusive but interesting. I will be truly excited once we have conclusive evidence."

Is this just fantasy?

Other mysteries in space have previously been thought of as signs of extraterrestrial life: a mysterious radio signal, repeating fast radio bursts and even a strangely flickering star, known as Tabby's Star.

The mysterious radio signal was later determined to be coming from Earth, the repeating fast radio bursts are still being investigated, and new research suggests that Tabby's Star is flickering because of dust -- rather than being an alien megastructure.
So what does that mean for 'Oumuamua?

"I am distinctly unconvinced and honestly think the study is rather flawed," Alan Jackson, fellow at the Centre for Planetary Sciences at the University of Toronto Scarborough, wrote in an email. "Carl Sagan once said, 'extraordinary claims require extraordinary evidence' and this paper is distinctly lacking in evidence nevermind extraordinary evidence."

Jackson published a paper in the Monthly Notices of the Royal Astronomical Society in March that suggests that 'Oumuamua came from a binary star system, or a system with two stars.
Jackson said the spectral data from 'Oumuamua looks like an asteroid or a comet, while that of a solar sail would look very different. The new paper proposes that the sail has been coated in interstellar dust, which obscures its true spectral signature.

"Any functional spacecraft would almost certainly retract its solar sail once in interstellar space to prevent damage," Jackson said. "The sail is useless once away from a star so there would be no reason to leave it deployed. If it was then deployed again on entering the solar system it would be pristine. Even if it was left deployed the dust accumulation would be primarily on the leading side like bugs on a windshield."

'Oumuamua also travels in a complex tumbling spin, but a functioning solar sail would have a much smoother path and obvious radiation-driven acceleration, Jackson said. Even the spinning motion of a damaged solar sail would be far more strongly influenced by the radiation forces than seen, he explained.

The solar sail would also be thinner than the authors of the new paper describe, he said.

"The sail on IKAROS is 7.5 micrometres thick with a mass of only 0.001g/cm^2, 100 times lower than their estimate," Jackson said. "While a combined spacecraft and sail could have a higher net mass the sail itself needs to be extremely light. That would also significantly change their estimate for how far it could travel before falling apart -- though as I said, I doubt any functional craft would leave its sail deployed in interstellar space."

Solar sails also can't change course after being launched, so if 'Oumuamua was truly a solar sail, it would be traceable back to its origin. So far, there is no obvious origin for 'Oumuamua.

"Beyond that, it becomes difficult to trace because of the motion of the stars and any hypothetical alien civilisation would face the same issue in charting a course that long in the first place (aside from arguments about whether they would want to launch a craft they knew would not reach its destination for many millions of years)," Jackson said.

Concerning 'Oumuamua, there is little evidence because astronomers weren't able to observe it for long, which opens it up to speculation in the name of science.

"The thing you have to understand is: scientists are perfectly happy to publish an outlandish idea if it has even the tiniest 'sliver' of a chance of not being wrong," astrophysicist and cosmologist Katherine Mack tweeted. "But until every other possibility has been exhausted dozen times over, even the authors probably don't believe it."

But it's important to distinguish that the researchers who wrote the new paper have expertise in solar sails, so they're suggesting that 'Oumuamua could be like a solar sail, said Coryn Bailer-Jones of the Max Planck Institute for Astronomy. Bailer-Jones' paper on possible origin sites for 'Oumuamua was accepted by the Astrophysical Journal in September.
"Aliens would only come into all of this if you accept their assumption (and that's what it is; it doesn't come from the data) that 'Oumuamua is sail-like, and also assume nothing like that can be natural," Bailer-Jones wrote in an email. "In fact, they only mention the word 'alien' once, when they mention in passing that 'Oumuamua might have been targeted to intercept the solar system.

"I have no problem with this kind of speculative study," Bailer-Jones added. "It's fun and thought-provoking, and the issue of whether there is alien life out there is really important. But the paper doesn't give any evidence for aliens (and the authors don't claim that, I should note.)"

          Your Client Engagement Program Isn't Doing What You Think It Is.

Amazing products without engaged clients are bound to fail, and companies claiming to have found the single best solution to client engagement are only fooling themselves.

What seems to work today to keep your clients engaged won't necessarily work tomorrow. The "optimal" client engagement tactic for your product will change over time and companies must be fluid and adaptable to accommodate ever-changing client needs and business strategies. Becoming complacent by settling for a strategy that works "for now" or "well enough" leads to risk aversion and unrealized potential. Constant recalibration is crucial, yet exploration can be costly and may lead nowhere. A principled approach to finding the right client engagement tactics at any point in time is essential.

Enter, Bandits!

Here at Stitch Fix, we prioritize personalization in every communication, interaction, and outreach opportunity we have with clients. Contextual bandits are one of the ways we enable this personalization.

In a nutshell, a contextual bandit is a framework that allows you to use algorithms to learn the most effective strategy for each individual client, while simultaneously using randomization to continuously track how successful each of your different action choices is.

Implementing a contextual-bandit-based client engagement program will allow you to:

  1. Understand how the performance of your tactics changes over time;
  2. Select a personalized tactic for each client based on his or her unique characteristics;
  3. Introduce new tactics relevant to subpopulations of clients in a systematic manner; and
  4. Continuously refine and improve your algorithms.

Part 1: There are Significant Limitations to Typical Client Engagement Approaches.

Let's set up a simple, clear example.

Current state: There is a group of individuals who have used your product in the past, but are no longer actively engaged. You want to remind them about your great product.

Proposed idea: We'll do an email campaign. Clients who haven't interacted with you for a while will be eligible to receive this email, and the purpose is to get them to visit your website. (Note that instead of email, we could just as easily use a widget on a website, a letter in the mail, or any other method of communicating with clients.)

The One Size Fits All Method

Your team brainstorms several tactics and decides to run a test to see which one works best. Let's use a simple three-tactic example here:

  1. Tactic A: no email. This is our control, which we can use to establish a baseline for client behavior;
  2. Tactic B: an emailed invitation to work 1-on-1 with (in our case) an expert stylist via email to make sure we get you exactly what you want; and
  3. Tactic C: an emailed promotional offer.

You run this test and find that Tactic B works the best. As such, you decide to scale Tactic B out to all clients, since it is expected to maximize return.

This is an endpoint in many marketing and product pipelines. The team celebrates the discovery of the "best" strategy and now it's time to move on to the next project. Case closed.

Main Limitations with this Approach

  1. You have no idea how long Tactic B will continue to be the best. Let's say over the next year your product improves and expands, leading to a change in the demographics of your client base. How confident are you that Tactic B is still the best?
  2. There is no personalization. All clients are receiving Tactic B. Some subsegment of clients likely would have performed better with Tactic C, but by scaling out Tactic B to all clients, these clients did not receive their optimal tactic.
  3. You are not taking advantage of key pieces of information. Since everyone in this audience was a previous client, you have information on how they interacted with you in the past! To address this, teams frequently build out a decision tree that groups their clients into broad categories (such as by age or tenure with the company). However…
  4. Using decision trees to group clients into categories leads to unoptimized outreach programs. Each category of clients might get a different tactic, and as new tactics or categories are created, these trees can grow larger and larger. Many of you have seen it: 10+ branch decision trees that try to segment a client base into different categories. When these trees grow too large, not only do they become difficult to manage, but it becomes more and more questionable whether or not you are actually doing what you think you are doing.
[Figure caption: Does this look like your program?]

Part 2: Multi-Armed Bandits Allow Continuous Monitoring

The above situation is common, suboptimal, and headache-inducing. Developing a testing and implementation strategy with bandits can remove or reduce many of these limitations.

A standard multi-armed bandit is the most basic bandit implementation. It allows us to allocate a small number of clients to continuously explore how different tactics are performing, while giving the majority of clients the current one-size-fits-all best-performing tactic. The standard implementation updates which tactic is best after every client interaction, allowing you to quickly settle on the most effective large-scale tactic.

To get into more technical detail, a multi-armed bandit is a system where you must select one action from a set of possible actions for a given 'resource' (in our example, the 'resource' is a client). The 'reward' (if a client responds to the offer or not) for the selected action is exposed, but the reward for all other actions remains unknown (we don't know what a specific client would have done if we had sent them a different offer!).

The reward for a given action can be thought of as a random reward drawn from a probability distribution specific to that action. Because these probability distributions may not be known or may change over time, we want this system to allocate some resources to improving our understanding of the different choices ('exploration') while simultaneously maximizing our expected gains based on our historical knowledge ('exploitation'). Our goal is to minimize "regret," defined as the difference between the sum of rewards if we used an optimal strategy and the actual sum of rewards realized.

Mathematically, if we have a bandit with K choices (K arms), we can define a reward $X_{i,n}$ for $1 \le i \le K$ and $n \ge 1$, where i is an arm of the bandit and n is the round we are currently considering.

Each round yields a single reward. The reward of each round is a function of which arm was selected for the client, and is assumed to be independent of both the rewards from previous rounds and the distributions of the other arms, and only dependent upon the probability distribution associated with the selected arm.

We thus want to minimize our regret, defined as:

$R_N = N\mu^* - \sum_{n=1}^{N} X_{i,n}$, where $N$ is the total number of rounds, $\mu^*$ is the maximum expected reward per round (that of the best arm), and $X_{i,n}$ is the reward in round n from selecting arm i.
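To make the regret definition concrete, here is a minimal sketch in Python. It computes the number of rounds times the best achievable average reward, minus the rewards actually collected. The function name `regret` and the numbers are illustrative assumptions, not values from our program.

```python
def regret(best_mean_reward, realized_rewards):
    """Regret = (rounds * best achievable mean reward) - (sum of realized rewards)."""
    n_rounds = len(realized_rewards)
    return n_rounds * best_mean_reward - sum(realized_rewards)


# Hypothetical numbers: the best tactic would convert 50% of clients,
# but only 3 of our 10 clients actually responded (reward 1 = responded).
rewards = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
print(regret(0.5, rewards))  # 10 * 0.5 - 3 = 2.0
```

In practice the best achievable mean reward is unknown, so regret is usually something you reason about or estimate in simulation rather than measure directly.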

There are numerous strategies that can be utilized to select how clients are allocated to either exploitation or exploration in order to minimize the regret of your bandit. For this article, we will consider the simplest, called the epsilon-greedy algorithm. In the epsilon-greedy algorithm, the best action is selected for 1−ϵ of your audience entering your program, and a random action is selected for the remaining ϵ of your audience. ϵ can be set to any value, depending on how many resources you want to allocate to exploration. For example, if ϵ is set to 0.1, then 10% of your audience is being directed to exploration, and 90% of your audience is being directed to your best tactic. If desired, ϵ can be decreased over time to reduce the total regret of your system. Other popular allocation strategies that can reduce the overall regret of your system include Thompson sampling and UCB1[1][2].

Back to our Example

Let’s get back to our 3-tactic email test, and set up a standard multi-armed bandit. We need to decide to reach out to a client with either Tactic A, B, or C. We are going to use an epsilon-greedy approach and set ϵ=0.1. 10% of our clients will randomly be presented with one of the three tactics. The other 90% will receive whichever tactic is the current top-performer. Each time we give a client a tactic, we update the score for that tactic. We then examine the performance of the different tactics, and re-assign the best-performing tactic to the one with the highest score.
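The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our production code: the class name `EpsilonGreedyBandit`, the simulated response rates in `TRUE_RATES`, and the use of a running mean as the "score" are all choices made for this example.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of tactics."""

    def __init__(self, tactics, epsilon=0.1, seed=None):
        self.tactics = list(tactics)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {t: 0 for t in self.tactics}
        self.scores = {t: 0.0 for t in self.tactics}  # running mean reward per tactic

    def best_tactic(self):
        # Ties (for example, before any data arrives) go to the first tactic listed.
        return max(self.tactics, key=lambda t: self.scores[t])

    def select(self):
        # With probability epsilon, explore uniformly at random; otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.tactics)
        return self.best_tactic()

    def update(self, tactic, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[tactic] += 1
        self.scores[tactic] += (reward - self.scores[tactic]) / self.counts[tactic]


# Hypothetical response rates, used only to simulate client behavior.
TRUE_RATES = {"A": 0.05, "B": 0.15, "C": 0.10}

bandit = EpsilonGreedyBandit(TRUE_RATES, epsilon=0.1, seed=0)
for _ in range(20_000):
    tactic = bandit.select()
    reward = 1 if bandit.rng.random() < TRUE_RATES[tactic] else 0
    bandit.update(tactic, reward)

print(bandit.counts, bandit.best_tactic())
```

Because exploration draws from all three tactics, every tactic keeps receiving a trickle of fresh data even after one of them has taken the lead.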

In the early stages of a standard multi-armed bandit, the best-performing tactic may switch frequently, but the pipeline will quickly stabilize.

The example above demonstrates a simple, standard multi-armed bandit to help get you started thinking about how a bandit implementation might look. This example already allows us to continuously monitor how tactics are performing and actively redirects most of our audience to the current "best" tactic, but there is plenty more we can do to improve performance! For example, if we want to further minimize regret, we would want to use a more powerful regret-minimization technique (UCB1, Thompson sampling, etc.). If we are nervous about our client population changing over time, we can implement a forgetting factor to down-weight older data points. New tactics can be added to this framework in a similar manner.
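One way a forgetting factor might look is an exponentially discounted average, sketched below. This is an assumption about the mechanism rather than a description of a specific implementation; the class name `DiscountedScore` and the parameter `gamma` are hypothetical.

```python
class DiscountedScore:
    """Running score for one tactic in which older rewards count for less.

    gamma is a hypothetical forgetting factor in (0, 1]; gamma = 1.0
    reduces to the plain average over all observations.
    """

    def __init__(self, gamma=0.99):
        self.gamma = gamma
        self.weighted_reward = 0.0  # discounted sum of rewards
        self.weight = 0.0           # discounted count of observations

    def update(self, reward):
        # Shrink everything seen so far by gamma, then add the new observation.
        self.weighted_reward = self.gamma * self.weighted_reward + reward
        self.weight = self.gamma * self.weight + 1.0

    @property
    def value(self):
        return self.weighted_reward / self.weight if self.weight else 0.0


# A tactic that worked for 100 clients and then stopped working for the next 100.
history = [1] * 100 + [0] * 100
plain, recent = DiscountedScore(gamma=1.0), DiscountedScore(gamma=0.9)
for r in history:
    plain.update(r)
    recent.update(r)

print(plain.value)   # 0.5: the plain average still credits the old successes
print(recent.value)  # close to 0: the discounted score reflects recent behavior
```

Note that this version discounts per observation of a given tactic, not per unit of calendar time; discounting by timestamp is an equally reasonable design.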

Part 3: Personalize Outreach with Contextual Bandits

While we have already improved upon our original example, let's take this a step further to get to true personalization.

Contextual bandits provide an extension to the bandit framework where a context (feature-vector) is associated with each client.[3] This allows us to personalize the action taken on each individual client, rather than simply applying the overall best tactic. For example, while Tactic B might perform the best if applied to all clients, there are certainly some clients who would respond better to Tactic C.

What this means in terms of our example is that instead of 90% of our clients being sent the one-size-fits-all best email, these clients instead enter our "selection algorithm." We then read in relevant features of these clients to decide which outreach tactic best suits each client’s needs and results in the best outcomes. We continue to assign 10% of clients to a random tactic, because this lets us know the unbiased performance of each tactic and provides us with data to periodically retrain our selection algorithm.

To get started, we need to take some unbiased data and train a machine learning model. Let’s assume we have been running the multi-armed bandit above: because we randomly assigned clients to our different tactics, we can use this data to train our algorithm! The point here is to understand which clients we should be assigning to each tactic, or, for a given client, which tactic gives this client the highest probability of having a positive outcome.
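As a rough sketch of this training step, the Python below stands in for a real machine learning model with the simplest possible one: a lookup table of observed response rates per (client segment, tactic) pair, built only from randomly assigned clients. The function names `train` and `choose_tactic`, the segment labels, and the fallback to Tactic A are all assumptions for illustration.

```python
from collections import defaultdict


def train(random_logs):
    """Estimate the response rate for each (context, tactic) pair.

    random_logs: iterable of (context, tactic, reward) tuples collected
    from clients who were assigned a tactic at random.
    """
    totals = defaultdict(lambda: [0, 0])  # (context, tactic) -> [reward sum, count]
    for context, tactic, reward in random_logs:
        cell = totals[(context, tactic)]
        cell[0] += reward
        cell[1] += 1
    return {key: reward_sum / count for key, (reward_sum, count) in totals.items()}


def choose_tactic(model, context, tactics, default="A"):
    """Pick the tactic with the highest estimated response rate for this context."""
    scored = [(model[(context, t)], t) for t in tactics if (context, t) in model]
    if not scored:
        return default  # no data for this kind of client yet
    return max(scored, key=lambda pair: pair[0])[1]


# Hypothetical log: the context is a coarse client segment.
logs = [
    ("new", "B", 1), ("new", "B", 1), ("new", "C", 0),
    ("loyal", "B", 0), ("loyal", "C", 1), ("loyal", "C", 1), ("loyal", "C", 0),
]
model = train(logs)
print(choose_tactic(model, "new", ["A", "B", "C"]))    # B
print(choose_tactic(model, "loyal", ["A", "B", "C"]))  # C
```

A real feature vector would rarely repeat exactly, which is why a trained model that generalizes across similar clients replaces the lookup table in practice.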

In a contextual bandit, we use client features to select the best tactic for the majority of our clients and continue to pull in data from our randomly allocated clients to retrain our algorithm.

While a multi-armed bandit can be relatively quick to implement, a contextual bandit adds a decent amount of complexity to our problem. In addition to training and applying a machine learning model to the majority of clients, there are a number of additional behind-the-scenes steps that must be taken: for example, we need to establish a cadence for retraining this model as we continue to acquire more data, as well as build out significantly more logging in order to track from which pathway clients are being assigned to different tactics.

Why do we care about which pathway clients received an offer from? If we use an algorithm to assign a tactic to a client, clients with certain characteristics are more likely to be assigned to certain tactics. This is why it is important to have some clients assigned randomly: we can be confident that, in the random assignments, the underlying client distributions are the same for all tactics.
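A minimal sketch of this bookkeeping in Python: each assignment records the pathway it came from, and the unbiased performance estimate uses only the randomly assigned rows. The function names `assign` and `unbiased_rates` and the log field names are hypothetical.

```python
import random


def assign(context, tactics, choose, epsilon=0.1, rng=random):
    """Return (tactic, pathway), where pathway records how the tactic was chosen."""
    if rng.random() < epsilon:
        return rng.choice(tactics), "random"
    return choose(context), "model"


def unbiased_rates(log):
    """Response rate per tactic, using only rows from the random pathway.

    log: iterable of dicts with keys 'tactic', 'pathway' and 'reward'.
    """
    sums, counts = {}, {}
    for row in log:
        if row["pathway"] != "random":
            continue  # model-assigned clients are not a representative sample
        tactic = row["tactic"]
        sums[tactic] = sums.get(tactic, 0) + row["reward"]
        counts[tactic] = counts.get(tactic, 0) + 1
    return {t: sums[t] / counts[t] for t in sums}


example_log = [
    {"tactic": "B", "pathway": "random", "reward": 1},
    {"tactic": "B", "pathway": "random", "reward": 0},
    {"tactic": "B", "pathway": "model", "reward": 1},
    {"tactic": "C", "pathway": "random", "reward": 1},
]
print(unbiased_rates(example_log))  # {'B': 0.5, 'C': 1.0}
```

The model-assigned row for Tactic B is ignored here: including it would mix in clients the model expected to respond well, inflating the estimate.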

Contextual Considerations

Depending on the regret-minimization strategy, there may be restrictions on the type of model you can build. For example, using an epsilon-greedy strategy allows you to use any model you like. However, with Thompson sampling, using a non-linear model introduces significant additional challenges.[5] When deciding how to approach regret minimization, make sure to take your desired modeling approach into account.

No implementation is perfect, and bandits are no exception. Both multi-armed bandits and contextual bandits are best utilized when we have a clear, well-defined 'reward' – maybe this is clicking on a webpage banner, or clicking through an email, or purchasing something. If you can't concretely define a reward, you can't say whether or not a tactic was successful, let alone train a model. In addition, if you have significantly delayed feedback you will have to do some additional work to get everything running smoothly[4]. Contextual bandits also lead to more complex code, which gives more room for things to break when compared to implementing simpler strategies.

Finally, think carefully about real-world restrictions: for example, while a contextual bandit can support a large number of tactics, if you have very few clients entering the program you are not going to be gaining much information from your random test group (the ϵ fraction of clients). In this case, it may be wise to restrict the number of tactics to something manageable. Remember though, you can always remove tactics to make room for new ones!

Part 4: Summary and Final Thoughts

Expanding your program from using business-logic-driven decision making to model-based decision making can significantly improve the performance of your client engagement strategies, and bandits can be a great tool to facilitate this transition.

While the example we used involved email, this same pipeline can be applied in a variety of different domains. Other examples include prioritizing widgets on a webpage, modifying the flow clients experience as they click through forms on your website, or many other situations where you are performing client outreach.

If you currently have a single tactic scaled out to all clients and are not gathering any unbiased data, one viable approach to improve this situation would be to first transition to a multi-armed bandit, and then at a later timepoint transition to a contextual bandit. A basic multi-armed bandit can be quick to implement and allow you to begin gathering unbiased data. Eventually, utilizing a contextual bandit in your client engagement strategy will allow you to actively adjust to changing client climates and needs, continuously test strategies, and personalize, personalize, personalize!

References

[1] Minimizing Regret: https://link.springer.com/content/pdf/10.1023%2FA%3A1013689704352.pdf
[2] Thompson sampling: http://proceedings.mlr.press/v23/agrawal12/agrawal12.pdf
[3] Contextual bandits seminal paper: https://arxiv.org/pdf/1003.0146.pdf
[4] Delayed Feedback: https://arxiv.org/pdf/1803.10937.pdf
[5] Another Thompson Sampling Paper: https://pdfs.semanticscholar.org/1c21/2b33a91d7b1c9878af0395d4992a6d4e0d54.pdf

          Estranho: Objeto que cruzou sistema solar pode ser nave alienígena.      Cache   Translate Page      

A mysterious cigar-shaped rocky object that crossed our solar system last year may be an alien spacecraft, astronomers at Harvard University in the United States have suggested.

Named Oumuamua, which means “messenger from afar arriving first” in Hawaiian, the object was the first known to travel from another planetary system to ours. It was discovered by the Pan-STARRS 1 telescope in Hawaii in October 2017.

Since its passage, scientists have struggled to explain its unusual characteristics and its precise origin. Researchers initially classified it as a comet and then as an asteroid, before finally deeming it a new type of “interstellar object”.

Now a new study by researchers at the Harvard-Smithsonian Center for Astrophysics, at Harvard University, raises the possibility that the object has an “artificial origin”.

“Oumuamua may be a fully operational probe sent intentionally to Earth vicinity by an alien civilization,” the astronomers wrote in the paper, which was submitted to the American scientific journal Astrophysical Journal Letters. The authors are Abraham Loeb, a professor of astronomy, and Shmuel Bialy, a postdoctoral researcher, both at Harvard.

The theory is based on the object's “excess acceleration”, that is, its unexpected increase in speed, according to the researchers. The dark body reached speeds of up to 315,000 kilometers per hour and left our solar system in January 2018.

In addition, Oumuamua rotates rapidly and varies in brightness by up to a factor of ten, far more than any other object observed so far.

Also according to the astronomers, the supposed spacecraft has a shape similar to that of LightSail-1, a kite-like solar sail project developed by the Planetary Society, which is based in the United States.

“Light-sail technology could be used abundantly for transporting cargo between planets or between stars,” the scientists say.
Ancient object

Another study, published in May by scientists from Brazil and France, suggests a different thesis about the origin of Oumuamua. Based on a computer simulation, the paper indicated that it formed naturally in another system and was captured by gravitational forces when our solar system formed from a cloud of gas and dust, about 4.5 billion years ago.

“This is a strong candidate for the oldest object in the solar system,” said astronomer Fathi Namouni of the Côte d’Azur Observatory in France. The study was published in the journal Monthly Notices of the Royal Astronomical Society: Letters. The research was carried out by Fathi Namouni and Helena Morais, of São Paulo State University (Unesp) in Rio Claro.

According to Morais, Oumuamua did not settle permanently into a solar orbit because its speed is so high that the Sun's pull was only enough to bend its trajectory, making its orbit slightly more hyperbolic. “It would have needed to arrive more slowly for the trajectory to become elliptical and for it to be captured by the solar system,” the scientist said.

The asteroid, which has a retrograde trajectory, would have been drawn into Jupiter's gravitational field at the end of the planet-formation era.

“This discovery tells us that the solar system is probably home to more extrasolar asteroids and comets captured earlier in its history. Some of these objects may have collided with Earth in the past, possibly carrying water, biomolecules or even organic matter,” Morais added.

(With Estadão Conteúdo and Reuters)

          Weird Traveling Space Boulder Could Be Alien Ship, Say Harvard Scientists

A massive flattened boulder-like object traveling through space with “peculiar acceleration” could be an alien craft, according to a new study.

It “may be a fully operational probe sent intentionally to Earth vicinity by an alien civilization,” states a paper by two Harvard scientists to be published Nov. 12 in The Astrophysical Journal Letters.

Scientists have been confounded by the interstellar object first spotted tumbling past the sun a year ago via telescopes on Maui. It was dubbed ’Oumuamua, which means “scout” in Hawaiian.

Because of its unusual cigar shape and speed, some quickly speculated that it originated from an alien civilization. It was scanned for radio waves, but none were detected. Other scientists deemed it a comet, despite the lack of a traditional tail.

Now professor Avi Loeb, chairman of Harvard’s astronomy department, and post-doctoral fellow Shmuel Bialy have again raised the possibility that it’s an alien ship — or possibly a piece of one.

A certain “discrepancy” in the object’s movement “is readily solved if ’Oumuamua does not follow a random trajectory but is rather a targeted probe,” they write. Such a probe may have been intentionally sent for a “reconnaissance mission into the inner region of the solar system,” Loeb said in an email to Universe Today.

It’s possible the object is propelled through space by some naturally occurring phenomenon. Or it could be an extraterrestrial spacecraft with an “artificial” light sail that relies on solar radiation pressure to generate propulsion, according to Loeb.

“There is data on the orbit of this object for which there is no other explanation. So we wrote this paper suggesting this explanation,” Loeb told the Boston Globe. “The approach I take to the subject is purely scientific and evidence-based.”

Andrew Siemion, director of the Berkeley SETI Research Center, called the paper “intriguing” in an email to the Globe.

“Observational anomalies like we see with Oumuamua, combined with careful reasoning, is exactly the method through which we make new discoveries in astrophysics — including, perhaps, truly incredible ones like intelligent life beyond the Earth,” he wrote.  

But SETI senior astronomer Seth Shostak said in an email to NBC that “one should not blindly accept this clever hypothesis when there is also a mundane explanation for ’Oumuamua — namely that it’s a comet or asteroid from afar.”

’Oumuamua is the first interstellar object ever observed in the solar system. Now it’s hurtling away and may never be seen again. 


          The possible UFO-asteroid everyone is talking about

An asteroid is setting off a wave of jokes on social media amid speculation that it could be part of an alien spacecraft.

Astronomers at the Harvard-Smithsonian Center for Astrophysics believe that the elongated shape of the first known interstellar asteroid, Oumuamua, works like a sail, which would explain its unexpected acceleration. This could mean that it is, among other options, extraterrestrial technology.

The scientists speculate about an artificial origin for the object: designed by an advanced civilization for an interstellar reconnaissance voyage, its mission now over, it would have become the debris of a wreck.

The study, "Could Solar Radiation Pressure Explain 'Oumuamua's Peculiar Acceleration?", published on arXiv, was carried out by Shmuel Bialy, a postdoctoral researcher at the Institute for Theory and Computation, and Abraham Loeb, the institute's director.

Oumuamua was first spotted by the Pan-STARRS-1 survey 40 days after its closest approach to the Sun (on 9 September 2017). At that point, it was about 0.25 AU from the Sun (a quarter of the distance between Earth and the Sun), and it was already leaving the Solar System.

At the time, astronomers noted that it appeared to have a high density (indicative of a rocky and metallic composition) and that it was spinning rapidly.

While it showed no signs of outgassing as it passed close to our Sun (which would have indicated that it was a comet), one research team obtained spectra indicating that Oumuamua was icier than previously thought.

As it began to leave the Solar System, the Hubble Space Telescope took some final images of Oumuamua that revealed unexpected behavior. Another research team found that Oumuamua had sped up rather than slowing down as expected.

Why did it not outgas near our Sun?

The most likely explanation was that Oumuamua was shedding material from its surface because of solar heating (outgassing). The release of this material, consistent with how a comet behaves, would give the asteroid the push it needed to increase its speed.

To this, Bialy and Loeb offer a counter-explanation. If Oumuamua really was a comet, why did it not outgas when it was closest to our Sun? They also cite other research showing that, if outgassing were responsible for the acceleration, it would also have caused a rapid evolution in Oumuamua's spin (which was not observed).

Essentially, they consider the possibility that Oumuamua could in fact be a light sail, a form of spacecraft that relies on radiation pressure to generate propulsion, similar to what Breakthrough Starshot, the project to send small spacecraft to other star systems, is developing.

As Loeb explained to Universe Today by email:

"We explain the excess acceleration of Oumuamua away from the Sun as the result of the force that sunlight exerts on its surface. For this force to explain the measured excess acceleration, the object needs to be extremely thin, of the order of a fraction of a millimeter in thickness but tens of meters in size. This makes the object lightweight for its surface area and allows it to act as a light sail. Its origin could be either natural (in the interstellar medium or protoplanetary disks) or artificial (as a probe sent for a reconnaissance mission into the inner region of the Solar System)."

Based on this, Bialy and Loeb calculated the likely shape, thickness and mass-to-area ratio that such an artificial object would have. They also tried to determine whether it could survive in interstellar space, and whether it could withstand the tensile stresses caused by rotation and tidal forces.

What they found was that a sail only a fraction of a millimeter thick would be enough for a sheet of solid material to survive the journey across the entire galaxy, although this depends heavily on its mass density. Thick or thin, such a sail could withstand collisions with the dust grains and gas that permeate the interstellar medium, as well as centrifugal and tidal forces.
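The "fraction of a millimeter" figure can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration and not the authors' own method. It assumes a flat sheet facing the Sun that absorbs all incoming light, the standard solar constant of about 1361 W/m² at 1 AU, and an excess acceleration of roughly 4.9 × 10⁻⁶ m/s² at 1 AU. That last number is recalled from the published orbit analysis and does not appear in this article, so treat it as an assumption. The function names are hypothetical.

```python
SOLAR_CONSTANT_W_M2 = 1361.0   # sunlight flux at 1 AU (standard value)
SPEED_OF_LIGHT_M_S = 2.998e8
# ASSUMPTION: excess (non-gravitational) acceleration at 1 AU, in m/s^2.
# This value is not given in the article.
EXCESS_ACCEL_1AU_M_S2 = 4.9e-6


def mass_per_area(accel_m_s2, momentum_factor=1.0):
    """Mass per unit area (kg/m^2) a sheet needs for sunlight to produce accel.

    momentum_factor is 1.0 for a perfect absorber and 2.0 for a perfect mirror.
    Radiation pressure is flux / c, and acceleration is pressure / (mass/area).
    """
    pressure_pa = momentum_factor * SOLAR_CONSTANT_W_M2 / SPEED_OF_LIGHT_M_S
    return pressure_pa / accel_m_s2


def thickness_mm(mass_per_area_kg_m2, density_kg_m3):
    """Thickness in millimeters of a uniform sheet with the given density."""
    return mass_per_area_kg_m2 / density_kg_m3 * 1000.0


if __name__ == "__main__":
    sigma = mass_per_area(EXCESS_ACCEL_1AU_M_S2)
    print(f"required mass per area: {sigma:.2f} kg/m^2")
    for density in (1000.0, 3000.0):   # illustrative densities, ice-like to rock-like
        print(f"density {density:.0f} kg/m^3 -> {thickness_mm(sigma, density):.2f} mm")
```

Under these assumptions the sheet needs roughly 0.9 kg per square meter, which corresponds to a thickness between about 0.3 mm and 0.9 mm for densities of 3 and 1 g/cm³. Because sunlight flux falls off as 1/r², the result does not depend on distance as long as the excess acceleration follows the same 1/r² law.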

As for what an extraterrestrial light sail would be doing in our Solar System, Bialy and Loeb offer some possible explanations.

They suggest that the probe may actually be a defunct sail floating under the influence of gravity and stellar radiation, much like debris from shipwrecks floating in the ocean. This would help explain why Breakthrough Listen found no evidence of radio transmissions.


          What's new on arXiv
Accelerating System Log Processing by Semi-supervised Learning: A Technical Report There is an increasing need for more automated system-log analysis …



          Salary problem at “Space”: presenters stay off air, an archive employee presents the news
“Space TV has been unable to pay its employees' salaries for many months. The channel's news presenters, left without pay, are not going on air or have left the channel. The news program is being presented by the channel's archive employee, Sevinc Mehbalıyeva.” 24saat.org reports that journalist Turxan Qarışqa shared this information. Another journalist has also reported that the channel has a salary problem. Journalist Famil Fərhadoğlu, on his Facebook page […]
          Harvard astronomers propose that the mysterious interstellar object Oumuamua is an alien spacecraft


Since Robert Weryk discovered Oumuamua crossing the Solar System in October 2017, 30,000,000 kilometers from Earth, various hypotheses have tried to explain what this interstellar object captured by the Pan-STARRS 1 telescope is. It was first classified as a comet, under the name C/2017 U1; once it was confirmed to show no activity, it was considered an asteroid and renamed A/2017 U1.

A new study raises the possibility that it is an extraterrestrial spacecraft.
Research from Harvard University, to be published on 12 November in The Astrophysical Journal Letters, proposes that the object "may be a probe sent intentionally to Earth vicinity by an alien civilization", NBC reported. The paper by Avi Loeb, chair of the university's Department of Astronomy, and Shmuel Bialy, a researcher at the Harvard-Smithsonian Center for Astrophysics, does not directly claim that aliens sent the presumed craft. "But after a careful analysis of the way the interstellar object accelerated as it sped past the Sun, they say that Oumuamua could be a spacecraft pushed through space by light falling on its surface," the outlet noted.
Loeb and Bialy put it this way: "Considering an artificial origin, one possibility is that Oumuamua is a light sail, floating in interstellar space as debris from advanced technological equipment. Alternatively, a more exotic scenario is that Oumuamua may be a fully operational probe," Universe Today quoted them as saying.
The great speed and the trajectory of this reddish object indicate, according to scientists, that it does not belong to the Solar System. But its flat, elongated shape, which has been compared to a cigar, does not match known bodies or phenomena.
"It is impossible to guess the purpose behind Oumuamua without more data," Loeb told NBC. One possibility is that it was floating when the Solar System ran into it, "like a ship bumping into a buoy on the surface of the ocean".

Loeb is an expert on solar sails, also called photon sails: devices launched into space (so far, as far as anyone knows, only by Earthlings) that consist of a large surface made of very light reflective sheets, hence the name sail, and that use solar radiation for propulsion. In his view, he added, Oumuamua would be an "exotic" one.
"We explain the excess acceleration of Oumuamua away from the Sun as the result of the force that sunlight exerts on its surface," he told Universe Today. "For this force to explain the measured acceleration, the object needs to be extremely thin, a fraction of a millimeter thick, but tens of meters across. This makes it lightweight relative to its surface area and allows it to act as a solar sail. Its origin could be natural or artificial."
Although some colleagues criticized them for the lack of proof, since their hypothesis is speculative, Loeb defended himself to NBC by arguing that his work rests precisely on the lack of evidence: "I follow the maxim of Sherlock Holmes: when you have excluded the impossible, whatever remains, however improbable, must be the truth."

An additional problem for the Harvard research is that Oumuamua has already left the Solar System and can no longer be seen with telescopes. But Loeb said that having observed the object should prompt astronomers to look for similar ones.
Loeb added to Universe Today: "Oumuamua could be a sample of extraterrestrial technology that came to explore our Solar System, in the same way that we hope to explore Alpha Centauri using Starshot and similar technologies." The alternative would be "to imagine that Oumuamua was on a reconnaissance mission".
INFOBAEZ


          The "strange" asteroid Oumuamua may be an alien spacecraft after all, say Harvard scientists

A massive, flat object traveling through space with "unusual acceleration" could indeed be an alien craft, according to a new study.

It may be "a fully operational probe sent intentionally to Earth vicinity by an alien civilization," according to a paper by two Harvard scientists due to be published on 12 November in The Astrophysical Journal Letters.

The object, which is thought to have come from outside our solar system and was first spotted by telescopes on Maui a year ago, has caused confusion and raised questions among scientists. It was given the name "Oumuamua", which means "scout" in Hawaiian.

Because of its unusual shape, reminiscent of a cigar, and its speed, some had put forward the hypothesis that it comes from an alien civilization. The object was scanned for radio waves, but none were detected. Other scientists called it a comet, despite its lack of a tail.

Now Professor Avi Loeb, chair of Harvard's astronomy department, and postdoctoral fellow Shmuel Bialy are again raising the possibility that it is an alien craft, or possibly a piece of one.

A particular oddity in the object's motion "is readily solved if Oumuamua does not follow a random trajectory but is rather a targeted probe," they write. Such a craft could have been sent intentionally for a "reconnaissance mission into the inner region of the Solar System," Loeb said in an email to Universe Today.

It is possible that the object moves through space by way of some natural phenomenon. Or it could be an alien craft propelled by a "solar sail" that uses solar radiation to push the craft along, according to Loeb.

"There is data on the orbit of this object for which there is no other explanation. So we wrote this paper suggesting this explanation," Loeb told the Boston Globe. "The approach I take to the subject is purely scientific and evidence-based."

Andrew Siemion, director of the Berkeley SETI Research Center, called the paper "intriguing" in an email to the Globe.

"Observational anomalies like we see with Oumuamua, combined with careful reasoning, is exactly the method through which we make new discoveries in astrophysics, including, perhaps, truly incredible ones like intelligent life beyond the Earth," he wrote.

However, SETI astronomer Seth Shostak wrote in an email to NBC that "one should not blindly accept this clever hypothesis when there is also a mundane explanation for Oumuamua, namely that it's a comet or asteroid from afar."

Oumuamua is the first interstellar object ever detected in the Solar System. It is now moving away and will most likely never be seen again.

Translated and republished from the US edition of HuffPost


          A new “ghost particle” may have been discovered, this time by CERN

A new particle, so far dubbed a “ghost particle”, may have been detected at the Large Hadron Collider (LHC) of the European Organization for Nuclear Research (CERN), in Switzerland. The discovery was made using an instrument at the particle accelerator known as the Compact Muon Solenoid (CMS).

The team said it had seen something that could be a particle with a mass twice that of a carbon atom. However, the object does not seem to fit known theories, which is causing a stir in the scientific world. The research has not yet been peer-reviewed (that is, checked by other academics), but it is available online.

By the current standards of nuclear physics, this new particle would be made up of muons, particles that are similar to the electron but 200 times more massive. It would also have about 1/4 of the mass of the Higgs boson (an elementary particle that, in theory, arose shortly after the Big Bang and that may help explain the origin of the universe).

Alexandre Nikitenko, a researcher on the team that worked with the data collected by CMS, told The Guardian that “theorists are excited and experimentalists are very sceptical” about the supposed physical phenomenon. “As a physicist I must be very critical, but as the author of this analysis I must have some optimism too,” he added.

It may still take a while to find out whether this particle is real or not. Science Alert, for its part, notes how “strange” it is that a mass has formed “where none was expected”. However, even if it turns out not to be real, it is not exactly a novelty in physics; after all, this is not the first news of its kind to emerge.

In July, astronomers announced the discovery of neutrinos coming from an energetic galaxy about 4 billion light-years from Earth. Although somewhat different, the neutrino is also a kind of “ghost particle”. And in September, scientists suggested they had “broken the Standard Model” with the detection of extremely high-energy cosmic neutrinos using the ANITA instrument.

The Standard Model is, as the name suggests, the set of principles underlying the traditional concepts of particle physics, and these two discoveries were not the only ones to challenge it. Going back a little further, to March, the strange “skyrmion”, a particle with properties similar to ball lightning (that is, electric currents circulating inside a ball of plasma), was announced to the community.


          ABC by QMC
A paper by Alexander Buchholz (CREST) and Nicolas Chopin (CREST) on quasi-Monte Carlo methods for ABC is going to appear in the Journal of Computational and Graphical Statistics. I had missed the opportunity when it was posted on arXiv and only became aware of the paper’s contents when I reviewed Alexander’s thesis for the doctoral […]
          How the UK, the US and other countries assess the engineering of "quantum communication"


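Why do developed countries such as the UK, the US and Japan give the impression, in the engineering and commercialization of "quantum communication", of having "risen early but arrived late at the market"? There have long been different readings of this question. This article introduces a white paper from a British intelligence agency, a study by the US Air Force and a review paper by a Japanese scientist, to offer readers a fresh perspective on it. "Stones from other hills may serve to polish jade." "Even a fool, after a thousand thoughts, gets one right." When making major technology decisions, listening to other voices and understanding how outsiders think can hardly be a mistake, can it?
So-called "quantum communication" is not a new communication technology, nor is it even a new cryptographic technology. What is currently promoted as "quantum communication" is merely a new method for "key distribution" within cryptography. Formal documents and papers call it quantum key distribution (QKD). This article uses the standard term QKD in most cases.
The white paper from the National Cyber Security Centre, part of British intelligence
In October 2016, the National Cyber Security Centre (NCSC), part of the UK's Government Communications Headquarters (GCHQ), published a white paper [1] recommending that the development of quantum key distribution (QKD) be dropped.
The first part of the white paper analyzes the technical limitations of QKD
1) QKD cannot solve most communication security problems
A QKD protocol is only a new mechanism for key distribution and agreement, for the two communicating parties to use when encrypting and decrypting data. Modern communication, however, requires diverse security services such as authentication, proof of data integrity, network channel establishment, access control and automatic software updates. Communication security depends more on authentication and integrity than on encryption and decryption alone.
QKD cannot replace the flexible and effective authentication mechanisms of traditional public-key cryptography. New technologies such as the Internet of Things (IoT), big data applications, social media and cloud services place more and newer demands on communication security, and QKD is even less able to meet this series of new challenges.
2) QKD systems face many limitations in application
A relatively short effective transmission range, and the fact that BB84 and similar protocols are all point-to-point, are QKD's two fatal weaknesses. This means that QKD is hard to integrate with the internet and the mobile internet.
Some researchers have tried to solve these problems by integrating QKD with classical network equipment through "trusted nodes". But this immediately invalidates any "quantum security guarantee" derived from the laws of quantum mechanics, and the auxiliary network equipment brings a series of new security risks.
3) QKD engineering is unlikely to be cost-effective
QKD is in essence a purely hardware solution, and hardware is relatively expensive to acquire and maintain. Unlike the software solutions used in traditional cryptography, hardware cannot be patched remotely to keep maintenance costs down when it needs an upgrade or a vulnerability is found.
The second part of the white paper focuses on the security of QKD
Any real QKD system will be built from classical components, such as light sources, detectors, optical fiber and possibly auxiliary classical network equipment, all of which may harbor security risks.
A variety of hacking experiments have been carried out on QKD demonstration systems. These attacks ended up controlling one or more hardware components of the system and thereby obtained the shared key without triggering an alarm.
Compared with modern internet or mobile network technology, QKD may find it harder to withstand denial-of-service (DoS) attacks.
QKD faces a range of potential hacking risks, and we do not yet fully understand the details of these attacks or the harm they can do. The UK currently lacks systematic, in-depth research into the vulnerabilities of real-world QKD systems. Such research should be encouraged, in order to build a comprehensive body of knowledge about attacking and defending QKD systems.
Further research is also needed on how to assess the security of engineered equipment accurately, and quantitative methods and standards should be developed for evaluating the security of engineered QKD systems.
The third part of the white paper analyzes alternatives to QKD
Academia and industry have taken a new interest in developing "quantum-safe" or "post-quantum" (classical) public-key mechanisms. The existing public-key schemes RSA, DSA and ECDH may become insecure once large quantum computers appear, and they will be directly replaced by a new generation of post-quantum public-key mechanisms.
The new consensus is that the best approach to communication security in the quantum era is to adopt post-quantum public-key cryptography and, on that basis, to update and upgrade today's traditional cryptography and packet-switched communication protocols in a steady, orderly way. Software or firmware for post-quantum public-key cryptography should be easier to develop, deploy and maintain, with a lower total cost over the project life cycle and higher security than QKD-based solutions.
Given that QKD provides only the key distribution function of a cryptosystem, real-world communication security would still have to rely on traditional public-key mechanisms to meet requirements such as device and user authentication and software updates. For modern communication security, post-quantum public-key cryptography has broader applicability and better performance than QKD. Research on post-quantum public-key cryptography is therefore indispensable for future network security, with or without QKD.
The fourth part of the white paper sets out measures and plans for strengthening communication security
Taking into account all the practical circumstances, trade-offs and security reasons listed above, our decision is:
Not to endorse the adoption of QKD for any government or military use
To advise against replacing any existing public-key solution with QKD in commercial applications
Scientific research on QKD should continue. However, more research should be done on the security vulnerabilities of engineered QKD; research in both directions, building and breaking, must proceed in parallel and stay in balance. To test the security of engineered QKD systems objectively, quantitative methods and standards must be developed. Responsible innovation must be accompanied by independent verification.
The above decision will not change unless the following conditions are met:
Robust commercial standards for QKD are established, based on experience gained from research into practical QKD vulnerabilities and on quantifiable means of security validation;
More reliable methods are available for estimating the total life-cycle cost of commercial QKD systems.
We encourage research and development of post-quantum public-key cryptography, which is a more practical and more cost-effective response for protecting communication systems against the threat of future quantum computers. Although a transition to post-quantum public-key schemes is needed, we do not believe the need to upgrade current systems is urgent. A steady, well-considered upgrade process will give researchers enough time to reach consensus on the best post-quantum public-key protocols.
The final part of the white paper is the conclusion
The outcome of the assessment of quantum key distribution is:
     The technology faces many obstacles and bottlenecks in engineering implementation;
     The technology cannot address many security risks;
     Its potential security risks remain poorly understood.
By contrast, to counter the security threat of future quantum computers, post-quantum public-key cryptography can provide more effective solutions for real-world communication security.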
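The US Air Force Scientific Advisory Board's study of quantum information technology
Likewise, in 2016 the US Air Force Scientific Advisory Board (SAB), an independent federal advisory committee made up of 50 scientists and researchers, produced a report after an in-depth study of the potential impact of quantum information technology [2].
Werner Dahm, the SAB chair and a former chief scientist of the US Air Force, said before the report came out: "About a third of the way into the study, it became clear that there is a lot of hype in this field. Quite a few people keep saying that quantum information technology could work miracles," he said. "These technologies may have enormous potential, but in reality much of it is hype, and they have no practical utility."
Ben FitzGerald, a senior member of the board and director of the technology and national security program, said that quantum information is part of the "next generation after next" of technology, and that its impact on national defense may still lie in the distant future.
The fourth point of the report's publicly released summary states very clearly that quantum key distribution significantly increases system complexity but is unlikely to enhance overall communication security, and that it offers little advantage over the best classical alternatives.
Since quantum communication performs poorly, the SAB concluded that funding in this area should go to developing other technologies.
"This conclusion was very surprising to me, but that is why we do these [studies]," Dahm added.
The above are the main points of the policy assessments of quantum key distribution by the British government's intelligence service and the US military.
Some of my own views.
1) The two reports reach the same conclusion: both hold that quantum key distribution (so-called "quantum communication") has no strategic practical value for protecting modern communication security, and that what really deserves effort is post-quantum cryptography, which is based on mathematical principles and implemented in software.
2) The conclusions of the two reports certainly deserve attention. But the methods of analysis and assessment in the white paper are more valuable than its final conclusions. Analysis matters more than conclusions: with scientific methods of data collection, analysis and assessment in hand, the relevant authorities can independently make up-to-date, scientific decisions suited to their own national circumstances.
3) Security analysis of quantum key distribution is the key to the whole decision assessment. It must be understood that the security of QKD is entirely different at three levels, namely basic theory, applied technology and engineering implementation; this is the main source of many disagreements and misunderstandings. In theory (on paper), QKD may be unconditionally, absolutely secure; at the level of applied technology (in the laboratory) there are many sources of insecurity; and in engineering projects (in actual use) its security is currently inferior to that of traditional cryptography. I hope the media will stop using exaggerated phrases like "unconditionally, absolutely secure" to mislead the public when promoting quantum communication, because this is not a scientific fact: it was not in the past, it is not now, and it will not be in the future. I analyze this in detail in a separate article.
4) British and American intelligence and military bodies have determined that QKD has many sources of insecurity at the technical level, and I think this conclusion is consistent with the findings of the international academic community. Professors and experts in quantum information in the US and Japan have published many papers on the security of QKD in IEEE and other journals. At the level of engineering and applied technology, IEEE journals often carry more authority than magazines such as Nature. Below is the conclusion of a 2017 paper by a Japanese professor of quantum information science [3]:
"Quantum key distribution (QKD) has attracted many researchers because, since its invention in 1984, it has been seen as a feasible way to distribute keys. However, in 2009 H. P. Yuen raised doubts about its security; O. Hirota, K. Kato and T. Iwakoshi then followed up Yuen's work and explained why Yuen's claims are correct. Subsequently, in 2016, Yuen himself published a paper summarizing his criticisms and pointing out the gap between QKD's security requirements and the actual situation. This paper explains the problems Yuen pointed out in detail, as well as recent trends in research on protocols other than QKD. QKD is indeed significant in itself, as the first step in the attempt to use quantum mechanics to secure communications. However, Yuen has made clear that the security proofs of QKD remain problematic, and he has proposed options other than QKD. The author believes that the knowledge gained in QKD research is also useful for other protocols. If researchers continue to study QKD, the author hopes they will take Yuen's criticisms very seriously."
5) British intelligence has not shut the door on QKD entirely. Note that having no strategic value does not mean having no research value; scientific research on QKD should continue. The white paper stresses that research on QKD should proceed in both directions at once, and that finding and attacking QKD's security vulnerabilities should be given greater weight. Moreover, attack and defense research must be carried out by two independent teams whose funding sources and reporting lines are completely separate, just like the red and blue forces fighting all-out in a live exercise. Britain truly is an old-school empire: it thinks problems through thoroughly and shrewdly.
6) It is very clear that the UK and the US care more about quantum computers than about "quantum communication". If the quantum computer is the spear that breaks communication security, then QKD is the shield that protects it. If QKD really were an unconditionally, absolutely secure shield, then the UK and the US, knowing that their rival has this shield while they do not, would be shrugging it off while straining every day to forge a spear that is useless against the enemy and harmful to themselves. Would they not have to be out of their minds? Can there really be such foolish schemers in the world? There is only one reasonable explanation: the UK and the US know full well that QKD is not an unconditionally, absolutely secure shield.
One point needs stating: I am only borrowing conspiracy thinking for the sake of logical argument, and am by no means trying to prove that a conspiracy exists. I have always disliked the exaggerated promotion of conspiracy theories. In frontier fields of scientific research such as quantum computers, "quantum communication" and post-quantum cryptography, cooperation among scientists of all countries far outweighs competition. That is in fact the reason and the basis for my writing this article.
[1] White paper published by the National Cyber Security Centre (NCSC), part of British intelligence
[2] US Air Force Scientific Advisory Board analysis report on quantum information technology (publicly released abstract)
[3] Paper by a Japanese quantum information technology expert: a critique of, and outlook on, the security of quantum key distribution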

          A star almost as old as the Universe


THE NAME is not the easiest to remember, but the reason it has made the (astronomical) headlines certainly is. It is called 2MASS J18082002–5104378 B and it is one of the oldest stars ever discovered. A star, that is, born at the dawn of the Universe, about 13.5 billion years old; one of those that, according to the scientists who found and studied it, turn up once in ten million. It is valuable for telling the story of the first stars that populated the universe. The account of this very old star is available as a preprint on arXiv and will be published in The Astrophysical Journal.
 
INSIDE THE STELLAR FURNACES
The material the star is made of is, the authors say, what came directly or almost directly out of the Big Bang, which seeded the primordial universe with light elements: hydrogen, helium and a little lithium. Heavy materials came later, produced by the first stars themselves, born in the furnaces of their cores and scattered through the Universe when those stars exploded as supernovae, an event that can be thought of as a kind of stellar death. Stars are in fact great chemistry laboratories in which elements are produced by nuclear reactions, a process known as nucleosynthesis, where lighter nuclei are fused together to create heavier nuclei, such as those of metals. All this, NASA notes, happens only in the presence of collisions at extremely high speeds, which in turn occur only at high temperatures. In this picture, metal content is taken as a measure of a star's age: the lower its metal content, the earlier it formed. Conversely, as cycles of fusion and stellar death follow one another, the metal abundance of stars increases.

A SMALL, OLD STAR
2MASS J18082002–5104378 B is a small star, almost invisible according to the researchers, and part of the binary system 2MASS J18082002–5104378; it is metal-poor and has a mass of about 14% that of the Sun. To deduce its composition, the researchers, led by Kevin Schlaufman of Johns Hopkins University (USA), analyzed spectra of the star obtained with instruments on the Magellan Telescopes and the Very Large Telescope in Chile. The scientists were thus able to establish that the star, which lies in the thin disk of our Milky Way, where the Sun is also found, has both a low mass and a low metal content. It has “fewer grams of heavy elements than any other known star”, the authors write.
 
HOW MANY STARS LIKE THIS?
“What we observed is important because for the first time we have been able to provide direct evidence that very old, low-mass stars exist and can survive to the present day without destroying themselves,” commented Andrew Casey of Monash University (Australia). Until recently the idea was that only massive stars could have formed at the dawn of the Universe, and that, having burned all their fuel and died, they could no longer be observed. What has been observed instead suggests that other small, metal-poor stars may exist in the universe and have survived. All that remains is to find them.


          Scientists say the mysterious Oumuamua could be an alien spacecraft

The mysterious ‘Oumuamua, the first interstellar visitor ever discovered, could behave like a solar sail, since it may have been propelled by a force known as radiation pressure as it approached our sun. That is what calculations by two astronomers at the Harvard-Smithsonian Center for Astrophysics suggest; they have posted their results on the preprint platform arXiv.org and submitted them to The Astrophysical Journal Letters. According to the authors, their conclusions are compatible with ‘Oumuamua actually being an extraterrestrial craft sent to the solar system to explore it.

Since its discovery in October 2017 by the University of Hawaii, the strange interstellar object, which sped across our solar system, has continued to fuel debate among astronomers. At first it seemed most likely that ‘Oumuamua was a comet, since these ice-rich objects are far more common than asteroids, in which rock and organic matter predominate.

However, it lacked the characteristic coma of comets, a halo of ice and dust that they expel when they heat up on approaching a star and that often takes the form of a tail. It did not show one even when it passed just 37 million kilometers from the sun, inside the orbit of Mercury. For this reason, and because of the color of its surface, which indicated the presence of organic matter, astronomers considered it more likely to be an asteroid, or a new type of object halfway between an asteroid and a comet.

The strange behavior of ‘Oumuamua also led some scientists to wonder whether it might be an object created by intelligent beings from another system. To find out, in December 2017 the Breakthrough Listen initiative listened to the object across a wide range of radio frequencies, but found no emission.


New research led by the European Space Agency (ESA) and published last June appeared to settle the debate. Astronomers detected that, when ‘Oumuamua passed its closest point to the Sun, it underwent an acceleration that gravitational attraction could not account for. The authors of that work deduced that the acceleration must be the product of a coma: ‘Oumuamua had been pushed along by the gases it expelled as it heated up near the Sun. The presence of a coma implies that ‘Oumuamua should be a comet. The coma could have been so small that it went unnoticed by telescopes, the study's lead author, Marco Micheli of ESA's near-Earth object tracking center, told Big Vang.

Now, the astronomers at the Harvard-Smithsonian Center for Astrophysics propose that ‘Oumuamua's acceleration as it approached the Sun was due not to a coma but to radiation pressure. This is a force that photons of light exert on all objects; it is very weak, but it can produce substantial accelerations if it acts on a very large surface. According to the authors, this force fits ‘Oumuamua's motion better, since the thrust from a coma would have made it rotate much faster than the observations revealed.

The researchers propose that ‘Oumuamua could have the shape of a solar sail: a very thin sheet with a large surface area. Solar sails are currently being studied on Earth as a means of propulsion for interstellar travel, where fuel-based propulsion would be unfeasible. Spacecraft have already tested this type of technology on interplanetary missions, such as Japan's IKAROS. In addition, the Breakthrough Starshot initiative, for which one of the study's authors, Abraham Loeb, is a scientific adviser, aims to send a solar sail to the star closest to our system, Proxima Centauri.


If it really is shaped like a solar sail, the researchers do not rule out that ‘Oumuamua could still be of natural origin, although no object with these characteristics is known. If, on the other hand, it were an artificial object, it could be technological debris adrift. Or, “alternatively, a more exotic scenario is that ‘Oumuamua is a fully operational craft sent intentionally to the vicinity of Earth by an extraterrestrial civilization,” the astronomers write in the paper. This possibility, they speculate, could explain ‘Oumuamua's strange orbit, which took it close to the Sun and the Earth before it sped away from the center of the solar system.

Editorial staff -- noticia.tn/vanguardia
          Harvard scientists suggest strange interstellar object 'Oumuamua may be an alien solar sail
On October 19, 2017, the Panoramic Survey Telescope and Rapid Response System-1 (Pan-STARRS-1) in Hawaii announced the first detection of an interstellar asteroid, designated 1I/2017 U1 (a.k.a. 'Oumuamua).


In the months that followed, multiple follow-up observations allowed astronomers to get a better idea of its size and shape, while also revealing that it had the characteristics of both a comet and an asteroid.

Interestingly, there has also been speculation that, based on its shape, 'Oumuamua might actually be an interstellar spacecraft (Breakthrough Listen even monitored it for radio signals!).

A new study by a pair of astronomers from the Harvard-Smithsonian Center for Astrophysics (CfA) has taken this a step further, suggesting that 'Oumuamua may actually be a light sail of extraterrestrial origin.


The study, "Could Solar Radiation Pressure Explain 'Oumuamua's Peculiar Acceleration?", which recently appeared online, was conducted by Shmuel Bialy and Abraham Loeb. Bialy is a postdoctoral researcher at the CfA's Institute for Theory and Computation (ITC), while Prof. Loeb is the director of the ITC, the Frank B. Baird Jr. Professor of Science at Harvard University, and the chair of the Breakthrough Starshot executive committee.

To recap, 'Oumuamua was first spotted by the Pan-STARRS-1 survey 40 days after it made its closest pass to the Sun (on September 9, 2017).

At this point, it was about 0.25 AU from the Sun (one-quarter of the distance between the Earth and the Sun) and already on its way out of the Solar System. At the time, astronomers noted that it appeared to have a high density (indicative of a rocky and metallic composition) and that it was spinning rapidly.

Although it showed no signs of outgassing as it passed close to our Sun (which would have indicated that it was a comet), a research team was able to obtain spectra indicating that 'Oumuamua was icier than previously thought.

Then, as it began to leave the Solar System, the Hubble Space Telescope took some final images of 'Oumuamua that revealed some unexpected behavior.

After examining the images, another international research team found that 'Oumuamua had increased in velocity, rather than slowing down as expected. The most likely explanation, they claimed, was that 'Oumuamua was venting material from its surface due to solar heating.

The release of this material, which is consistent with how a comet behaves, would give 'Oumuamua the steady push it needed to achieve this increase in velocity.

To this, Bialy and Loeb offer a counter-explanation. If 'Oumuamua were in fact a comet, why did it not experience outgassing when it was closest to our Sun?

In addition, they cite other research showing that if outgassing were responsible for the acceleration, it would also have caused a rapid evolution in 'Oumuamua's spin (which was not observed).

Basically, Bialy and Loeb consider the possibility that 'Oumuamua could in fact be a light sail, a form of spacecraft that relies on radiation pressure to generate propulsion, similar to what the Breakthrough Starshot initiative is developing. As with the plans for Starshot, this light sail may have been sent by another civilization to study our Solar System and look for signs of life. As Prof. Loeb explained to Universe Today via email:

"We explain the excess acceleration of 'Oumuamua away from the Sun as the result of the force that sunlight exerts on its surface. For this force to explain the measured excess acceleration, the object needs to be extremely thin, of order a fraction of a millimeter in thickness, but tens of meters in size. This makes the object lightweight for its surface area and allows it to act as a light sail. Its origin could be either natural (in the interstellar medium or proto-planetary disks) or artificial (as a probe sent on a reconnaissance mission into the inner region of the Solar System)."

Based on this, Bialy and Loeb calculated the likely shape, thickness, and mass-to-area ratio that such an artificial object would have. They also tried to determine whether this object would be able to survive in interstellar space, and whether it could withstand the tensile stresses caused by rotation and tidal forces.

What they found was that a sail only a fraction of a millimeter thick (0.3-0.9 mm) would be sufficient for a sheet of solid material to survive the journey across the entire galaxy, although this depends greatly on 'Oumuamua's mass density.
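
To get a feel for why the sheet has to be so thin, the following is a minimal back-of-the-envelope sketch, not the authors' calculation. It assumes a flat sheet facing the Sun, the standard solar constant at 1 AU (about 1361 W/m^2), and illustrative densities of 1000 and 3000 kg/m^3; none of these numbers come from the article, and the function name and `momentum_coeff` parameter are hypothetical.

```python
# Rough estimate of the acceleration sunlight imparts to a thin flat sheet.
# All constants and densities below are illustrative assumptions.
SOLAR_CONSTANT_W_M2 = 1361.0        # solar flux at 1 AU (standard value)
SPEED_OF_LIGHT_M_S = 299_792_458.0


def radiation_pressure_acceleration(thickness_m, density_kg_m3,
                                    distance_au=1.0, momentum_coeff=1.0):
    """Return the acceleration (m/s^2) of a sheet facing the Sun.

    momentum_coeff is 1 for a perfect absorber and 2 for a perfect mirror.
    """
    mass_per_area = density_kg_m3 * thickness_m            # kg per m^2 of sheet
    flux = SOLAR_CONSTANT_W_M2 / distance_au ** 2          # inverse-square falloff
    pressure = momentum_coeff * flux / SPEED_OF_LIGHT_M_S  # N/m^2
    return pressure / mass_per_area


if __name__ == "__main__":
    # Two thin sheets with the same mass per area, and a 1 m thick rock slab.
    cases = [(0.3e-3, 3000.0), (0.9e-3, 1000.0), (1.0, 3000.0)]
    for thickness_m, density in cases:
        a = radiation_pressure_acceleration(thickness_m, density)
        print(f"{thickness_m * 1e3:8.1f} mm, {density:6.0f} kg/m^3 -> {a:.2e} m/s^2")
```

Under these assumptions the acceleration scales inversely with mass per unit area, so a sub-millimeter sheet is pushed thousands of times harder than a meter-thick slab of the same material.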

Thick or thin, this sail would be able to withstand collisions with the dust grains and gas that permeate the interstellar medium, as well as centrifugal and tidal forces.

(Wikimedia Commons / Andrzej Mirecki)

As for what an extraterrestrial light sail would be doing in our Solar System, Bialy and Loeb offer some possible explanations. First, they suggest that the probe may actually be a defunct sail floating under the influence of gravity and stellar radiation, similar to debris from shipwrecks floating in the ocean. This would help explain why Breakthrough Listen found no evidence of radio transmissions.

Loeb further illustrated this idea in a recent article he wrote for Scientific American, where he suggested that 'Oumuamua could be the first known case of an artificial relic that floated into our Solar System from interstellar space.

What's more, he notes that light sails of similar dimensions have been designed and built by humans, including the Japanese-designed IKAROS project and the Starshot initiative with which he is involved.

"This opportunity establishes a potential foundation for a new frontier of space archaeology, namely, the study of relics from past civilizations in space," Loeb wrote.

"Finding evidence of space junk of artificial origin would provide an affirmative answer to the age-old question 'Are we alone?'. This would have a dramatic impact on our culture and add a new cosmic perspective to the significance of human activity."

On the other hand, as Loeb told Universe Today, 'Oumuamua could be an active piece of alien technology that came to explore our Solar System, the same way we hope to explore Alpha Centauri using Starshot and similar technologies:

"The alternative is to imagine that 'Oumuamua was on a reconnaissance mission. The reason I contemplate the reconnaissance possibility is that the assumption that 'Oumuamua followed a random orbit requires the production of ~10^{15} such objects per star in our galaxy. This abundance is up to a hundred million times greater than expected from the Solar System, based on a calculation we did in 2009. A surprisingly high overabundance, unless 'Oumuamua is a targeted probe on a reconnaissance mission and not a member of a random population of objects."

According to Loeb, there is also the fact that 'Oumuamua's orbit brought it to within 0.25 AU of the Sun, which is a good orbit for intercepting Earth without experiencing too much solar irradiation. In addition, it came to within 0.15 AU of Earth, which could have been the result of orbital corrections designed to facilitate a flyby.

Alternatively, he states that it is possible that hundreds of such probes were sent so that one of them would come close enough to Earth to study it. The fact that the Pan-STARRS-1 survey barely detected 'Oumuamua at its closest approach could be seen as an indication that there are many other such objects that went undetected, strengthening the case that 'Oumuamua is one of many such probes.

Considering that astronomers recently concluded that our Solar System has likely captured thousands of interstellar objects like 'Oumuamua, this opens the possibility of future detections that could help prove (or disprove) the case for an interstellar light sail.

Naturally, Bialy and Loeb acknowledge that there are still too many unknowns to say with certainty what 'Oumuamua really is. And even if it turns out to be a piece of natural rock, all other asteroids and comets previously detected have had mass-to-area ratios orders of magnitude larger than the current estimates for 'Oumuamua.

That, and the fact that radiation pressure appears capable of accelerating it, would mean that 'Oumuamua represents a new class of thin interstellar material never seen before. If true, this opens up a new set of mysteries, such as how this material was produced and why (or by whom).

Although it has been beyond the reach of our telescopes for almost a year, 'Oumuamua is sure to remain the subject of intense study for years to come. And you can bet that astronomers will be on the lookout for more of them!

This article was originally published by Universe Today. Read the original article.

          Was Oumuamua A Solar Sail From An Alien Civilization That Flew Past Earth Last Year? Entertaining But Implausible Suggestion

Short summary: it's an entertaining but rather far-fetched proposal in an arXiv preprint, not published anywhere but mentioned in a Scientific American op-ed. It is implausible for many reasons, including its spectrum, which is not the shiny spectrum you'd expect from a solar sail but the red of tholins mixed with rock and metal, as you'd expect from an asteroid or comet.



          1-OGC: The first open gravitational-wave catalog of binary mergers from analysis of public Advanced LIGO data
arXiv:1811.01921

by: Nitz, Alexander H.
Abstract:
We present the first Open Gravitational-wave Catalog (1-OGC), obtained by using the public data from Advanced LIGO's first observing run to search for compact-object binary mergers. Our analysis is based on new methods that improve the separation between signals and noise in matched-filter searches for gravitational waves from the merger of compact objects. The three most significant signals in our catalog correspond to the binary black hole mergers GW150914, GW151226, and LVT151012. We observe these signals at a true discovery rate of $99.92\%$. We find that LVT151012 alone has a 97.6$\%$ probability of being astrophysical in origin. No other significant binary black hole candidates are found, nor did we observe any significant binary neutron star or neutron star--black hole candidates. We make available our complete catalog of events, including the sub-threshold population of candidates.
          Leading soft theorem for multiple gravitinos
arXiv:1811.01804

by: Jain, Diksha
Abstract:
We compute the leading soft theorem for multiple gravitinos (and gravitons) in an arbitrary theory of supergravity with an arbitrary number of finite-energy particles of arbitrary mass and arbitrary spin by extending Sen's approach \cite{Sen:2017xjn} to fermionic symmetry. Our result is true for any compactification of type II and heterotic superstring theory. Our result is valid at all orders in perturbation theory for four and higher spacetime dimensions.
          Holographic complexity under a global quantum quench
arXiv:1811.01473

by: Fan, Zhong-Ying
Abstract:
There are several different proposals relating holographic complexity to gravitational objects defined on the Wheeler-DeWitt patch. In this paper, we investigate the evolution of complexity following a global quantum quench for these proposals. We find that, surprisingly, they all reproduce known properties of complexity, such as the switchback effect. However, each of these proposals also has its own characteristic features during the dynamical evolution, which may serve as a powerful tool to distinguish the various holographic duals of complexity.
          Analytic example of the Aretakis type behaviour of the metric
arXiv:1811.01371

by: Akhmedov, E.T.
Abstract:
We consider an extremal Reissner-Nordstr\"om black hole perturbed by a neutral massive point particle, which falls in radially. We study the linear metric perturbation in the vicinity of the black hole and find that the $l=0$ and $l=1$ spherical modes of the metric oscillate rather than decay.
          The $\Omega_{c}$-puzzle solved by means of spectrum and strong decay amplitude predictions
arXiv:1811.01799

by: Santopinto, E.
Abstract:
The observation of new $\Omega_{c}=ssc$ states by LHCb \cite{Aaij:2017nav} and the confirmation of four of them by Belle \cite{Yelton:2017qxg} may represent an important milestone in our understanding of the quark organization inside hadrons. By providing results for the spectrum of $\Omega_{c(b)}$ baryons and predictions for their $\Xi _{c(b)}^{+}K^{-}$ decay channels, we suggest a possible solution to the $\Omega_{c}$ quantum number puzzle. We also discuss why the set of $\Omega_{c(b)}$ baryons is the most suitable environment to test the validity of three-quark and quark-diquark effective degrees of freedom. Finally, we calculate the masses and the partial decay widths of the $\Xi_b(6227)$ and $\Sigma_b(6097)$ states, just observed by LHCb \cite{Aaij:2018yqz,Aaij:2018tnn}. Our results are in good agreement with LHCb experimental data.
          Decays of Pentaquarks in Hadrocharmonium and Molecular Pictures
arXiv:1811.01691

by: Eides, Michael I.
Abstract:
We consider decays of the hidden charm LHCb pentaquarks in the hadrocharmonium and molecular scenarios. In both pictures the LHCb pentaquarks are essentially nonrelativistic bound states. We develop a semirelativistic framework for calculation of the partial decay widths that allows the final particles to be relativistic. Using this approach we calculate the decay widths in the hadrocharmonium and molecular pictures. Molecular hidden charm pentaquarks are constructed as loosely bound states of charmed and anticharmed hadrons. Calculations show that molecular pentaquarks decay predominantly into states with open charm. Strong suppression of the molecular pentaquark decays into states with hidden charm is qualitatively explained by the relatively large size of the molecular pentaquark. The decay pattern of hadrocharmonium pentaquarks, which are interpreted as loosely bound states of excited charmonium $\psi'$ and nucleons, is quite different. This time decays into states with hidden charm dominate, but suppression of the decays with charm exchange is weaker than in the respective molecular case. The weaker suppression is explained by a larger binding energy and correspondingly smaller size of the hadrocharmonium pentaquarks. These results, combined with the experimental data on partial decay widths, could make it possible to determine which of the two theoretical scenarios for pentaquarks (if either) is chosen by nature.
          On the covariance of scalar averaging and backreaction in relativistic inhomogeneous cosmology
arXiv:1811.01374

by: Heinesen, Asta
Abstract:
We introduce a generalization of the 4-dimensional averaging window function of Gasperini, Marozzi and Veneziano (2010) that may prove useful for a number of applications. The covariant nature of spatial scalar averaging schemes to address the averaging problem in relativistic cosmology is an important property that is implied by construction, but usually remains implicit. We employ here the approach of Gasperini et al. for two reasons. First, the formalism and its generalization presented here are manifestly covariant. Second, the formalism is convenient for disentangling the dependencies on foliation, volume measure, and boundaries in the averaged expressions entering in scalar averaging schemes. These properties will prove handy for simplifying expressions, but also for investigating extremal foliations and for comparing averaged properties of different foliations directly. The proposed generalization of the window function allows for choosing the most appropriate averaging scheme for the physical problem at hand, and for distinguishing between the role of the foliation itself and the role of the volume measure in averaged dynamic equations. We also show that one particular window function obtained from this generalized class results in an averaging scheme corresponding to that of a recent investigation by Buchert, Mourier and Roy (2018) and, as a byproduct, we explicitly show that the general equations for backreaction derived therein are covariant.
          Higher Spin ANEC and the Space of CFTs
arXiv:1811.01913

by: Meltzer, David
Abstract:
We study the positivity properties of the leading Regge trajectory in higher-dimensional, unitary, conformal field theories (CFTs). These conditions correspond to higher spin generalizations of the averaged null energy condition (ANEC). By studying higher spin ANEC, we will derive new bounds on the dimensions of charged, spinning operators and prove that if the Hofman-Maldacena bounds are saturated, then the theory has a higher spin symmetry. We also derive new, general bounds on CFTs, with an emphasis on theories whose spectrum is close to that of a generalized free field theory. As an example, we consider the Ising CFT and show how the OPE structure of the leading Regge trajectory is constrained by causality. Finally, we use the analytic bootstrap to perform additional checks, in a large class of CFTs, that higher spin ANEC is obeyed at large and finite spin. In the process, we calculate corrections to large spin OPE coefficients to one-loop and higher in holographic CFTs.
          The inner engine of GeV-radiation-emitting gamma-ray bursts
arXiv:1811.01839

by: Ruffini, R.
Abstract:
We motivate how the most recent progress in understanding the nature of the GeV radiation in the most energetic gamma-ray bursts (GRBs), the binary-driven hypernovae (BdHNe), has led to the solution of a forty-year-old unsolved problem in relativistic astrophysics: how to extract the rotational energy from a Kerr black hole for powering synchrotron emission and ultra high-energy cosmic rays. The \textit{inner engine} is identified in the proper use of a classical solution introduced by Wald in 1972, duly extended to the most extreme conditions found around the newly-born black hole in a BdHN. The energy extraction process occurs in a sequence of impulsive processes, each accelerating protons to $10^{21}$~eV in a timescale of $10^{-6}$~s and in the presence of an external magnetic field of $10^{13}$~G. A specific example is given for a black hole of initial angular momentum $J=0.3\,M^2$ and mass $M\approx 3\,M_\odot$ leading to the GeV radiation of $10^{49}$~erg$\cdot$s$^{-1}$. The process can energetically continue for thousands of years.
          Post-Newtonian phase accuracy requirements for stellar black hole binaries with LISA
arXiv:1811.01805

by: Mangiagli, Alberto
Abstract:
The Laser Interferometer Space Antenna (LISA) will observe black hole binaries of stellar origin during their gravitational wave inspiral, months to years before coalescence. Due to the long duration of the signal in the LISA band, a faithful waveform is necessary in order to keep track of the binary phase. This is crucial to extract the signal from the data and to perform an unbiased estimation of the source parameters. We consider Post-Newtonian (PN) waveforms, and analyze the PN order needed to keep the bias caused by the PN approximation negligible relative to the statistical parameter estimation error, as a function of the source parameters. By considering realistic population models, we conclude that for $\sim 90\%$ of the stellar black hole binaries detectable by LISA, waveforms at low Post-Newtonian (PN) order (PN $\le 2$) are sufficiently accurate for an unbiased recovery of the source parameters. Our results provide a first estimate of the trade-off between waveform accuracy and information recovery for this class of LISA sources.
          An Anisotropic stellar model
arXiv:1811.01548

by: Kokkinos, D.
Abstract:
We present the matching of two solutions, both belonging to Carter's family [A] of metrics. The interior solution has been found by one of us [1] and represents an anisotropic fluid; the exterior solution is the vacuum member of Carter's family of metrics [2]. We study the model resulting from the matching procedure and we give some perspectives of our work. 15 pages.
          Black holes/naked singularities in four-dimensional non-static space-time and the energy-momentum distributions
arXiv:1811.01707

by: Ahmed, Faizuddin
Abstract:
In this article, we discuss four-dimensional non-static space-times in the background of de Sitter and anti-de Sitter spaces, with a stiff fluid, an anisotropic fluid, and an electromagnetic field as the matter-energy sources. Under various parameter conditions the solutions may represent models of naked singularities and/or black holes. Finally, the energy-momentum distributions are evaluated using the complexes of Landau-Lifshitz, Einstein, Papapetrou, and M{\o}ller.
          Two new approaches to the anomalous limit of Brans-Dicke to Einstein gravity
arXiv:1811.01728

by: Faraoni, Valerio
Abstract:
Contrary to common belief, (electro)vacuum Brans-Dicke gravity does not reduce to general relativity for large Brans-Dicke coupling $\omega$, a problem which has never been fully solved. Two new approaches, independent from each other, shed light on this issue producing the same result: in the limit $\omega \rightarrow \infty $ an (electro)vacuum Brans-Dicke spacetime reduces to a solution of the Einstein equations sourced, not by (electro)vacuum, but by a minimally coupled scalar field. The latter is shown to coincide with the Einstein frame scalar field. The first method employs a direct analysis of the Einstein frame, while the second (complementary and independent) method uses an imperfect fluid representation of Brans-Dicke gravity together with a little known 1-parameter symmetry group of this theory.
          Supersymmetry and $T \overline{T}$ Deformations
EFI-18-17
arXiv:1811.01895

by: Chang, Chih-Kai
Abstract:
We propose a manifestly supersymmetric generalization of the solvable $T \overline{T}$ deformation of two-dimensional field theories. For theories with $(1,1)$ and $(0,1)$ supersymmetry, the deformation is defined by adding a term to the superspace Lagrangian built from a superfield containing the supercurrent. We prove that the energy levels of the resulting deformed theory are determined exactly in terms of those of the undeformed theory. This supersymmetric deformation extends to higher dimensions, where we conjecture that it might provide a higher-dimensional analogue of $T \overline{T}$, producing supersymmetric Dirac or Dirac-Born-Infeld actions in special cases.
          Revisiting the Number-Theory Dark Matter Scenario and the Weak Gravity Conjecture
arXiv:1811.01755
UT-18-25
TU-1075
IPMU18-0177

by: Nakayama, Kazunori
Abstract:
We revisit the number-theory dark matter scenario where one of the light chiral fermions required by the anomaly cancellation conditions of U(1)_{B-L} explains dark matter. Focusing on some of the integer B-L charge assignments, we explore a new region of the parameter space where there appear two light fermions and the heavier one becomes a dark matter of mass O(10)keV or O(10)MeV. The dark matter radiatively decays into neutrino and photon, which can explain the tantalizing hint of the 3.55keV X-ray line excess. Interestingly, the other light fermion can erase the AdS vacuum around the neutrino mass scale in a compactification of the standard model to 3D. This will make the standard model consistent with the AdS-WGC statement that stable non-supersymmetric AdS vacua should be absent.
          Gauss-Bonnet inflation and swampland
arXiv:1811.01625

by: Yi, Zhu
Abstract:
The two swampland criteria are generically in tension with the single field slow-roll inflation because the first swampland criterion requires small tensor to scalar ratio while the second swampland criterion requires large tensor to scalar ratio. The challenge to the single field slow-roll inflation imposed by the swampland criteria can be avoided by modifying the relationship between the tensor to scalar ratio and the slow-roll parameter. We show that the Gauss-Bonnet inflation with the coupling function inversely proportional to the potential overcomes the challenge by adding a constant factor in the relationship between the tensor to scalar ratio and the slow-roll parameter. For the Gauss-Bonnet inflation, while the swampland criteria are satisfied, the slow-roll conditions are also fulfilled, so the scalar spectral tilt and the tensor to scalar ratio are consistent with the observations. We use the potentials for chaotic inflation and the E-model as examples to show that the models pass all the constraints. The swampland criteria may imply Gauss-Bonnet coupling.
          Interference of Electromagnetic waves in the background of the gravitational waves
arXiv:1811.01489

by: Wang, Wenyu
Abstract:
Based on the relationship between proper distance and coordinate distance, the geometrical phenomenon caused by passing gravitational waves cannot be observed locally. The electromagnetic wave equations in a gravitational-wave background are studied. We find that the expansion and contraction of wavelengths are always synchronous with the objects they measure. The gravitational-wave background leads to dissipation and dispersion in the propagation of electromagnetic waves. The phase of the gravitational waves controls the dissipation term and the dispersion term in the telegrapher's equation. A linearly polarized laser beam propagating in the direction of the incoming gravitational waves can give a possible measurement of the local metric. In the case of pulsed beats passing by, the relaxation time is greater than the period of the gravitational waves, so the detector may only show a signal of the modulation of the beats. Finally, we propose a non-local interference experiment to detect high-frequency gravitational waves. It is similar to the measurement of the redshift caused by gravitation. Together with the ordinary detector, it will give us further, mutually complementary measurements of gravitational waves.
          Infrared resummation for derivative interactions in de Sitter space
NCTS-TH/1814
arXiv:1811.01830

by: Kitamoto, Hiroyuki
Abstract:
In de Sitter space, scale invariant fluctuations give rise to infrared logarithmic corrections to physical quantities, which eventually spoil perturbation theory. For non-derivative interactions, it is known that the field equation reduces to a Langevin equation with white noise in the leading logarithm approximation. The stochastic equation allows us to evaluate the infrared effects nonperturbatively. We extend the resummation formula so that it is also applicable to models with derivative interactions. We first consider the nonlinear sigma model, and next consider a more general model consisting of a non-canonical kinetic term and a potential term. The stochastic equations derived from the infrared resummation in these models can be understood as generalizations of the standard one to curved target spaces.
          Massless Cosmic Strings in Expanding Universe
arXiv:1811.01563

by: Fursaev, Dmitri V.
Abstract:
Circular massless cosmic strings which move with the speed of light in the de Sitter universe are described. Construction of the background geometry is based on parabolic isometries of the de Sitter spacetime. Microscopic circular cosmic strings may appear at the Planck epoch and then grow up to the Hubble size. We analyze: images of the strings, influence of strings on trajectories of matter, formation of overdensities, and shifts of energies of photons. These effects allow one to discriminate massless strings from their massive cousins. The present work extends our results on straight massless cosmic strings in Minkowski spacetime to curved backgrounds.
          Rotating and non-linear magnetic-charged black hole surrounded by quintessence
arXiv:1811.01562

by: Benavides-Gallego, Carlos A.
Abstract:
In this work we derived a rotating and non-linear magnetic-charged black hole surrounded by quintessence using the Newman-Janis algorithm. Considering the state parameter $\omega_q=-3/2$, we studied the event horizons, the ergosphere, and the ZAMO. We found that the existence of the outer horizon is constrained by the values of the charge $Q$. Furthermore, we found that the ergo-region increases when both the charge $Q$ and the spin parameter $a$ are increased. On the other hand, regarding equatorial circular orbits, we studied the limit given by the static radius on the existence of circular geodesics, the photon circular geodesics, and the innermost stable circular orbits (ISCO). We show that photon circular orbits do not depend strongly on $Q$, and $r_{ISCO}$ is constrained by the values of charge.
          Curved spacetime effective field theory (cEFT) -- construction with the heat kernel method
arXiv:1811.01656

by: Nakonieczny, Łukasz
Abstract:
In this paper we tackle the problem of constructing an effective field theory in curved spacetime (cEFT). To this end, we propose to use the heat kernel method. After introducing the general formalism, based on well-established formulas from the application of the heat kernel method to deriving the one-loop effective action in curved spacetime, we test it on selected problems. The examples were chosen to check the validity of the derived formulas by comparing the results with known flat-spacetime calculations. They also allowed us to obtain new results concerning the influence of gravity-induced operators on the effective field theory without unnecessary calculational complications.
          Transitioning from equal-time to light-front quantization in $\phi_2^4$ theory
arXiv:1811.01685

by: Chabysheva, Sophia S.
Abstract:
We implement the limiting procedure of Hornbostel for the quantization of two-dimensional $\phi^4$ theory in a sequence of coordinate systems that interpolate between equal-time and light-front coordinates. This allows computation of the vacuum state in the odd and even sectors of the theory and computation of massive states built on these vacua. Results are compared with those of the equal-time calculations of Rychkov and Vitale and those of standard light-front calculations.
          Holographic RG flows and $AdS_5$ black strings from 5D half-maximal gauged supergravity
arXiv:1811.01608

by: Dao, H.L.
Abstract:
We study five-dimensional $N=4$ gauged supergravity coupled to five vector multiplets with compact and non-compact gauge groups $U(1)\times SU(2)\times SU(2)$ and $U(1)\times SO(3,1)$. For the $U(1)\times SU(2)\times SU(2)$ gauge group, we identify $N=4$ $AdS_5$ vacua with $U(1)\times SU(2)\times SU(2)$ and $U(1)\times SU(2)_{\textrm{diag}}$ symmetries and analytically construct the corresponding holographic RG flow interpolating between these two critical points. The flow describes a deformation of the dual $N=2$ SCFT driven by vacuum expectation values of dimension-two operators. In addition, we study $AdS_3\times S^2$ geometries dual to twisted compactifications of $N=2$ SCFTs with flavor symmetry $SU(2)$ and without flavor symmetry. We find a number of $AdS_3\times S^2$ solutions preserving eight supercharges for different twists from $U(1)\times U(1)\times U(1)$ and $U(1)\times U(1)_{\textrm{diag}}$ gauge fields. We numerically construct various RG flow solutions interpolating between $N=4$ $AdS_5$ critical points and these $AdS_3\times S^2$ geometries in the IR. The solutions can also be interpreted as supersymmetric black strings in asymptotically $AdS_5$ space. These types of holographic solutions are also studied for the non-compact $U(1)\times SO(3,1)$ gauge group. In this case, only one $N=4$ $AdS_5$ vacuum exists, and we give RG flow solutions from this $AdS_5$ to a singular geometry in the IR corresponding to an $N=2$ non-conformal field theory. An $AdS_3\times H^2$ solution together with an RG flow between this vacuum and the $N=4$ $AdS_5$ are also given for this gauge group.
          Harvard-Smithsonian astronomers: could the mysterious interstellar object be part of an ET probe?

First discovered a year ago, Oumuamua is the strange cigar-shaped object of interstellar origin that flew through our solar system at 196,000 mph. Since it was first spotted, scientists haven't decisively determined whether it's a mildly active comet or something else. Now, astronomers Shmuel Bialy and Abraham Loeb of the Harvard Smithsonian Center for Astrophysics have released a scientific paper asking if Oumuamua could be a "lightsail of artificial origin," part of a space probe developed by an advanced extraterrestrial civilization. Of course this is not a statement of fact but rather a question, albeit a very very interesting one. From CNN:

"'Oumuamua may be a fully operational probe sent intentionally to Earth vicinity by an alien civilization," they wrote in the paper, which has been submitted to the Astrophysical Journal Letters.

The theory is based on the object's "excess acceleration," or its unexpected boost in speed as it traveled through and ultimately out of our solar system in January 2018.

"Considering an artificial origin, one possibility is that 'Oumuamua is a light sail, floating in interstellar space as a debris from an advanced technological equipment," wrote the paper's authors, suggesting that the object could be propelled by solar radiation.

"COULD SOLAR RADIATION PRESSURE EXPLAIN ‘OUMUAMUA’S PECULIAR ACCELERATION?" (PDF)

(image: artist's impression of Oumuamua, ESO/M. Kornmesser)


          Physicists write an equation for making a perfect pizza at home
A Margherita pizza is topped with cheese, mozzarella, and basil (Photo: Flickr/Shoichi Iwashita/Creative Commons)

Not everyone can travel to Naples, Italy, whenever the craving for a perfect pizza strikes. But three researchers decided to write a scientific article with the ideal formula for preparing the dish at home. The study, "The Physics of Baking a Good Pizza", was posted on arXiv.

Andrey Varlamov, of the Institute of Superconductors, Oxides and Other Innovative Materials and Devices in Rome; Andreas Glatz, of Northern Illinois University in the United States; and Sergio Grasso, an Italian food anthropologist and filmmaker, watched a pizzaiolo prepare several Margherita pizzas for the analysis.

According to the pizzaiolo, the secret was the brick oven. With wood burning in one corner, heat enters evenly through the curved walls and the stone floor of the oven, ensuring that the whole diameter of the dough is baked. Under ideal conditions, a Margherita pizza can be baked perfectly in exactly two minutes at 330 degrees Celsius.

When extra toppings require additional cooking time, pizzaiolos can lift the pizza with a wooden or aluminum peel for 30 seconds or more. The intention is to let the heat spread through the dough while keeping the bottom from burning.

The formula
In an electric oven, found in many homes, the pizza is baked on a metal tray. Because metal has a higher thermal conductivity than brick, the bottom of the dough absorbs heat much faster than the rest. In this case, leaving the temperature at 330 degrees could turn the pie into charcoal, which is not what you want when you are hungry.

Using a long thermodynamic equation, the researchers determined that a homemade pizza should be baked at 230 degrees Celsius for exactly 170 seconds.

The scientists also noted that toppings with a higher water content, basically those with vegetables, may need to stay in longer, since the pizza returns more heat to the oven through evaporation.

Now that you have learned the formula, it is time to get your hands on the dough, literally.



          Even a Few Bots Can Shift Public Opinion in Big Ways

Nearly two-thirds of the social media bots with political activity on Twitter before the 2016 U.S. presidential election supported Donald Trump.


          contextual: Evaluating Contextual Multi-Armed Bandit Problems in R. (arXiv:1811.01926v1 [cs.LG])

Authors: Robin van Emden, Maurits Kaptein

Over the past decade, contextual bandit algorithms have been gaining in popularity due to their effectiveness and flexibility in solving sequential decision problems---from online advertising and finance to clinical trial design and personalized medicine. At the same time, there are, as of yet, surprisingly few options that enable researchers and practitioners to simulate and compare the wealth of new and existing bandit algorithms in a standardized way. To help close this gap between analytical research and empirical evaluation the current paper introduces the object-oriented R package "contextual": a user-friendly and, through its object-oriented structure, easily extensible framework that facilitates parallelized comparison of contextual and context-free bandit policies through both simulation and offline analysis.
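To make the idea of simulating a bandit policy concrete, here is a minimal sketch in Python of an epsilon-greedy policy on a toy contextual Bernoulli bandit. It is not the API of the R package "contextual"; the function name, the reward table, and the parameters `horizon`, `epsilon`, and `seed` are all illustrative assumptions.

```python
import random


def simulate_epsilon_greedy(reward_probs, horizon=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a toy contextual Bernoulli bandit.

    reward_probs[c][a] is the (hypothetical) success probability of arm a
    in context c. Returns the average reward collected over the horizon.
    """
    rng = random.Random(seed)
    n_contexts = len(reward_probs)
    n_arms = len(reward_probs[0])
    counts = [[0] * n_arms for _ in range(n_contexts)]
    means = [[0.0] * n_arms for _ in range(n_contexts)]
    total = 0.0
    for _ in range(horizon):
        c = rng.randrange(n_contexts)  # observe a context
        if rng.random() < epsilon:
            a = rng.randrange(n_arms)  # explore a random arm
        else:
            # exploit the arm with the best estimated mean in this context
            a = max(range(n_arms), key=lambda arm: means[c][arm])
        r = 1.0 if rng.random() < reward_probs[c][a] else 0.0
        counts[c][a] += 1
        means[c][a] += (r - means[c][a]) / counts[c][a]  # running mean
        total += r
    return total / horizon


if __name__ == "__main__":
    print(simulate_epsilon_greedy([[0.9, 0.1], [0.1, 0.9]]))
```

Fixing the seed keeps runs repeatable, which is what allows different policies to be compared on the same simulated environment.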


          Mobile Edge Cloud: Opportunities and Challenges. (arXiv:1811.01929v1 [cs.DC])

Authors: Sayed Chhattan Shah

Mobile edge cloud is emerging as a promising technology for internet of things and cyber-physical system applications such as smart homes and intelligent video surveillance. In a smart home, various sensors are deployed to monitor the home environment and the physiological health of individuals. The data collected by sensors are sent to an application, where numerous algorithms for emotion and sentiment detection, activity recognition and situation management are applied to provide healthcare- and emergency-related services and to manage resources at the home. Executing these algorithms requires a vast amount of computing and storage resources. To address the issue, the conventional approach is to send the collected data to an application on an internet cloud. This approach has several problems such as high communication latency, communication energy consumption and unnecessary data traffic to the core network. To overcome the drawbacks of the conventional cloud-based approach, a new system called mobile edge cloud is proposed. In mobile edge cloud, multiple mobile and stationary devices interconnected through wireless local area networks are combined to create a small cloud infrastructure in a local physical area such as a home. Compared to traditional mobile distributed computing systems, mobile edge cloud introduces several complex challenges due to the heterogeneous computing environment, heterogeneous and dynamic network environment, node mobility, and limited battery power. The real-time requirements associated with internet of things and cyber-physical system applications make the problem even more challenging. In this paper, we describe the applications and challenges associated with the design and development of a mobile edge cloud system and propose an architecture based on a cross-layer design approach for effective decision making.


          DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora. (arXiv:1811.01931v1 [stat.ML])

Authors: Robert Giaquinto, Arindam Banerjee

Extracting common narratives from multi-author dynamic text corpora requires complex models, such as the Dynamic Author Persona (DAP) topic model. However, such models can struggle to scale to large corpora, often because of challenging non-conjugate terms. To overcome such challenges, in this paper we adapt new ideas in approximate inference to the DAP model, resulting in the DAP Performed Exceedingly Rapidly (DAPPER) topic model. Specifically, we develop Conjugate-Computation Variational Inference (CVI) based variational Expectation-Maximization (EM) for learning the model, yielding fast, closed-form updates for each document, replacing iterative optimization in earlier work. Our results show significant improvements in model fit and training time without needing to compromise the model's temporal structure or the application of Regularized Variational Inference (RVI). We demonstrate the scalability and effectiveness of the DAPPER model by extracting health journeys from the CaringBridge corpus --- a collection of 9 million journals written by 200,000 authors during health crises.


          A personal model of trumpery: Deception detection in a real-world high-stakes setting. (arXiv:1811.01938v1 [cs.CL])

Authors: Sophie van der Zee, Ronald Poppe, Alice Havrileck, Aurelien Baillon

Language use reveals information about who we are and how we feel [1-3]. One of the pioneers in text analysis, Walter Weintraub, manually counted which types of words people used in medical interviews and showed that the frequency of first-person singular pronouns (i.e., I, me, my) was a reliable indicator of depression, with depressed people using I more often than people who are not depressed [4]. Several studies have demonstrated that language use also differs between truthful and deceptive statements [5-7], but not all differences are consistent across people and contexts, making prediction difficult [8]. Here we show how well linguistic deception detection performs at the individual level by developing a model tailored to a single individual: the current US president. Using tweets fact-checked by an independent third party (Washington Post), we found substantial linguistic differences between factually correct and incorrect tweets and developed a quantitative model based on these differences. Next, we predicted whether out-of-sample tweets were either factually correct or incorrect and achieved a 73% overall accuracy. Our results demonstrate the power of linguistic analysis in real-world deception research when applied at the individual level and provide evidence that factually incorrect tweets are not random mistakes of the sender.


          Operation Control Protocols in Power Distribution Grids. (arXiv:1811.01942v1 [cs.LO])

Authors: Yehia Abd Alrahman, Hugo Torres Vieira

Future power distribution grids will comprise a large number of components, each potentially able to carry out operations autonomously. Clearly, in order to ensure safe operation of the grid, individual operations must be coordinated among the different components. Since operation safety is a global property, modelling component coordination typically involves reasoning about systems at a global level. In this paper, we propose a language for specifying grid operation control protocols from a global point of view. We show how such global specifications can be used to automatically generate local controllers of individual components, and that the distributed implementation yielded by such controllers operationally corresponds to the global specification. We showcase our development by modelling a fault management scenario in power grids.


          A practical method for the consistent identification of a module in a dynamical network. (arXiv:1811.01943v1 [cs.SY])

Authors: Michel Gevers, Alexandre Sanfelice Bazanella, Gian Vianna da Silva

We present a new and simple method for the identification of a single transfer function that is embedded in a dynamical network. In existing methods the consistent identification of the desired transfer function relies on the positive definiteness of the spectral density matrix of the vector of all node signals, and it typically requires knowledge of the topology of the whole network. The positivity condition is on the internal signals and therefore cannot be guaranteed a priori; in addition, it is far from necessary. The new method of this paper provides simple conditions on which nodes to excite and which nodes to measure in order to produce a consistent estimate of the desired transfer function. Just as importantly, it requires knowledge of the local topology only.


          Chaotic Quantum Double Delta Swarm Algorithm using Chebyshev Maps: Theoretical Foundations, Performance Analyses and Convergence Issues. (arXiv:1811.01945v1 [cs.NE])

Authors: Saptarshi Sengupta, Sanchita Basak, Richard Alan Peters II

The Quantum Double Delta Swarm (QDDS) algorithm is a new metaheuristic algorithm inspired by the convergence mechanism to the center of potential generated within a single well of a spatially co-located double-delta well setup. It mimics the wave nature of candidate positions in solution spaces and draws upon quantum mechanical interpretations much like other quantum-inspired computational intelligence paradigms. In this work, we introduce a Chebyshev map driven chaotic perturbation in the optimization phase of the algorithm to diversify weights placed on contemporary and historical, socially-optimal agents' solutions. We follow this up with a characterization of solution quality on a suite of 23 single-objective functions and carry out a comparative analysis with eight other related nature-inspired approaches. By comparing solution quality and successful runs over dynamic solution ranges, insights about the nature of convergence are obtained. A two-tailed t-test establishes the statistical significance of the solution data whereas Cohen's d and Hedges' g values provide a measure of effect sizes. We trace the trajectory of the fittest pseudo-agent over all function evaluations to comment on the dynamics of the system and prove that the proposed algorithm is theoretically globally convergent under the assumptions adopted for proofs of other closely-related random search algorithms.
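For readers unfamiliar with Chebyshev maps, the sketch below iterates the standard map x_{n+1} = cos(k arccos x_n) on [-1, 1]. It only illustrates how such a chaotic sequence is generated; how the abstract's algorithm turns the sequence into weights is not stated here, so the function name and the default `order` are assumptions.

```python
import math


def chebyshev_sequence(x0, order=4, length=10):
    """Iterate the Chebyshev map x_{n+1} = cos(order * arccos(x_n)).

    x0 must lie in [-1, 1]; every iterate stays in that interval.
    """
    if not -1.0 <= x0 <= 1.0:
        raise ValueError("x0 must lie in [-1, 1]")
    xs = []
    x = x0
    for _ in range(length):
        x = math.cos(order * math.acos(x))
        # Guard against tiny floating-point excursions outside [-1, 1].
        x = max(-1.0, min(1.0, x))
        xs.append(x)
    return xs


if __name__ == "__main__":
    print(chebyshev_sequence(0.3, order=4, length=5))
```

Because the iterates stay bounded in [-1, 1], a sequence like this can be rescaled to any weight range an optimizer needs; that rescaling is left out because it would be guesswork.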


          A Differential Volumetric Approach to Multi-View Photometric Stereo. (arXiv:1811.01984v1 [cs.CV])

Authors: Fotios Logothetis, Roberto Mecca, Roberto Cipolla

Highly accurate 3D volumetric reconstruction is still an open research topic where the main difficulties are usually related to merging rough estimations with high frequency details. One of the most promising methods is the fusion between multi-view stereo and photometric imaging 3D shape reconstruction techniques. However, beyond the intrinsic difficulties of making multi-view stereo and photometric stereo work reliably on their own, additional problems arise when the two are considered together. Most importantly, the projection of the fine details usually retrievable with photometric stereo onto the rough multi-view stereo reconstruction is difficult to handle.

In this work, we present a volumetric approach to the multi-view photometric stereo problem defined by a unified differential model. The key to our method is the signed distance field parameterisation which avoids the complex step of re-projecting high frequency details as the parameterisation of the whole volume allows a photometric modeling on the volume itself efficiently dealing with occlusions, discontinuities, etc. The relation between the surface normals and the gradient of the signed distance field leads to a homogeneous linear partial differential equation. A variational optimisation is adopted in order to combine multiple images from multiple points of view in a single system avoiding the need of merging depth maps. Our approach is evaluated on synthetic and real data-sets and achieves state-of-the-art results.


          Compact Personalized Models for Neural Machine Translation. (arXiv:1811.01990v1 [cs.CL])

Authors: Joern Wuebker, Patrick Simianer, John DeNero

We propose and compare methods for gradient-based domain adaptation of self-attentive neural machine translation models. We demonstrate that a large proportion of model parameters can be frozen during adaptation with minimal or no reduction in translation quality by encouraging structured sparsity in the set of offset tensors during learning via group lasso regularization. We evaluate this technique for both batch and incremental adaptation across multiple data sets and language pairs. Our system architecture - combining a state-of-the-art self-attentive model with compact domain adaptation - provides high quality personalized machine translation that is both space and time efficient.
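As a rough illustration of why group lasso encourages structured sparsity, the sketch below computes the penalty and applies the standard block soft-thresholding step, which zeroes out whole groups at once. The grouping of offsets into plain Python lists, the function names, and the use of a proximal step are assumptions for illustration; the abstract does not say how the authors optimize the penalty.

```python
import math


def group_lasso_penalty(groups, lam):
    """Return lam * sum of Euclidean norms, one norm per group."""
    return lam * sum(math.sqrt(sum(v * v for v in g)) for g in groups)


def group_soft_threshold(groups, lam):
    """Block soft-thresholding: shrink each group toward zero as a unit.

    A group whose norm is at most lam is set exactly to zero, which is
    what makes entire offset tensors droppable after adaptation.
    """
    result = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        result.append([scale * v for v in g])
    return result


if __name__ == "__main__":
    offsets = [[3.0, 4.0], [0.1, 0.1]]  # hypothetical offset groups
    print(group_lasso_penalty(offsets, lam=1.0))
    print(group_soft_threshold(offsets, lam=1.0))
```

In this toy input the large group is only shrunk while the small group vanishes entirely, the behavior that would let the corresponding base parameters stay frozen.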


          The Sparsest Additive Spanner via Multiple Weighted BFS Trees. (arXiv:1811.01997v1 [cs.DC])

Authors: Keren Censor-Hillel, Ami Paz, Noam Ravid

Spanners are fundamental graph structures that sparsify graphs at the cost of small stretch. In particular, in recent years, many sequential algorithms constructing additive all-pairs spanners were designed, providing very sparse small-stretch subgraphs. Remarkably, it was then shown that the known (+6)-spanner constructions are essentially the sparsest possible, that is, a larger additive stretch cannot guarantee a sparser spanner, which brought the stretch-sparsity trade-off to its limit. Distributed constructions of spanners are also abundant. However, for additive spanners, while there were algorithms constructing (+2) and (+4)-all-pairs spanners, the sparsest case of (+6)-spanners remained elusive.

We remedy this by designing a new sequential algorithm for constructing a (+6)-spanner with the essentially-optimal sparsity of roughly O(n^{4/3}) edges. We then show a distributed implementation of our algorithm, answering an open problem in [Censor-Hillel et al., DISC 2016].

A main ingredient in our distributed algorithm is an efficient construction of multiple weighted BFS trees. A weighted BFS tree is a BFS tree in a weighted graph, that consists of the lightest among all shortest paths from the root to each node. We present a distributed algorithm in the CONGEST model, that constructs multiple weighted BFS trees in |S|+D-1 rounds, where S is the set of sources and D is the diameter of the network graph.
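The definition of a weighted BFS tree can be made concrete with a small sequential sketch: among all paths with the fewest hops from the root, each node keeps the one with the smallest total weight. This is a single-machine illustration only, not the distributed CONGEST algorithm from the abstract; the adjacency format and function name are assumptions.

```python
from collections import deque


def weighted_bfs_tree(adj, root):
    """Build a weighted BFS tree of an undirected weighted graph.

    adj maps each node to a dict {neighbor: edge_weight}.
    Returns (hops, weight, parent): hop distance from the root, weight of
    the lightest minimum-hop path, and the parent on that path.
    """
    hops = {root: 0}
    weight = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, w in adj[u].items():
            if v not in hops:
                # First discovery fixes the hop distance of v.
                hops[v] = hops[u] + 1
                weight[v] = weight[u] + w
                parent[v] = u
                queue.append(v)
            elif hops[v] == hops[u] + 1 and weight[u] + w < weight[v]:
                # Same hop count through u, but a lighter path: switch parent.
                weight[v] = weight[u] + w
                parent[v] = u
    return hops, weight, parent


if __name__ == "__main__":
    graph = {
        "a": {"b": 5, "c": 1},
        "b": {"a": 5, "d": 1},
        "c": {"a": 1, "d": 1},
        "d": {"b": 1, "c": 1},
    }
    print(weighted_bfs_tree(graph, "a"))
```

The update is safe because BFS dequeues every node of one level before any node of the next, so a node's weight is final by the time it is expanded.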


          Blockchain-based Privacy-Preserving Charging Coordination Mechanism for Energy Storage Units. (arXiv:1811.02001v1 [cs.CR])

Authors: Mohamed Baza, Mahmoud Nabil, Muhammad Ismail, Mohamed Mahmoud, Erchin Serpedin, Mohammad Rahman

Energy storage units (ESUs) enable several attractive features of modern smart grids such as enhanced grid resilience, effective demand response, and reduced bills. However, uncoordinated charging of ESUs stresses the power system and can lead to mass blackouts. On the other hand, existing charging coordination mechanisms suffer from several limitations. First, the need for a central charging controller (CC) presents a single point of failure that jeopardizes the effectiveness of the charging coordination. Second, existing mechanisms overlook privacy concerns of the involved customers. To address these limitations, in this paper, we leverage the blockchain and smart contracts to build a decentralized charging coordination mechanism without the need for a centralized charging coordinator. After each ESU sends a charging request to the smart contract address on the blockchain, the smart contract will run the charging coordination mechanism in a self-executed manner such that ESUs with the highest priority are charged in the present time slot while charging requests of lower priority ESUs are deferred to future time slots. We have implemented the proposed mechanism on the Ethereum blockchain and our analysis shows that the blockchain enables a decentralized charging coordination with increased transparency, reliability, and privacy preservation.
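The scheduling rule described above (charge the highest-priority ESUs now, defer the rest) can be sketched off-chain in a few lines of Python. This is not the authors' smart contract: the request tuple layout, the numeric priority, and the per-slot `capacity` are all hypothetical, and the privacy-preserving parts are omitted.

```python
def schedule_slot(requests, capacity):
    """Pick which ESUs charge in the current time slot.

    requests: list of (esu_id, priority, demand) tuples, where a larger
    priority means more urgent (an assumed convention).
    capacity: total energy available in this slot (assumed parameter).
    Returns (charged_ids, deferred_ids).
    """
    charged, deferred = [], []
    remaining = capacity
    # Highest priority first; sorted() is stable, so ties keep input order.
    for esu_id, _priority, demand in sorted(requests, key=lambda r: -r[1]):
        if demand <= remaining:
            charged.append(esu_id)
            remaining -= demand
        else:
            deferred.append(esu_id)  # retried in a future time slot
    return charged, deferred


if __name__ == "__main__":
    reqs = [("esu-a", 1, 5), ("esu-b", 3, 4), ("esu-c", 2, 4)]
    print(schedule_slot(reqs, capacity=8))
```

A design choice in this sketch is that a small low-priority request may still be served after a larger high-priority one is deferred; a stricter variant would stop at the first request that does not fit.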


          Finding Mixed Nash Equilibria of Generative Adversarial Networks. (arXiv:1811.02002v1 [cs.LG])

Authors: Ya-Ping Hsieh, Chen Liu, Volkan Cevher

We reconsider the training objective of Generative Adversarial Networks (GANs) from the mixed Nash Equilibria (NE) perspective. Inspired by the classical prox methods, we develop a novel algorithmic framework for GANs via an infinite-dimensional two-player game and prove rigorous convergence rates to the mixed NE, resolving the longstanding problem that no provably convergent algorithm exists for general GANs. We then propose a principled procedure to reduce our novel prox methods to simple sampling routines, leading to practically efficient algorithms. Finally, we provide experimental evidence that our approach outperforms methods that seek pure strategy equilibria, such as SGD, Adam, and RMSProp, both in speed and quality.
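To give a feel for what "converging to a mixed Nash equilibrium" means, the sketch below runs multiplicative-weights (Hedge) updates for both players of a small zero-sum matrix game and averages the strategies played. This finite toy is a stand-in chosen for illustration, not the infinite-dimensional prox method of the abstract; the payoff matrix, step size `eta`, and function names are assumptions.

```python
import math


def _normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def hedge_mixed_equilibrium(payoff, steps=50000, eta=0.01):
    """Approximate a mixed equilibrium of a zero-sum matrix game.

    payoff[i][j] is what the row player (maximizer) receives when row i
    meets column j; the column player minimizes it. Returns the
    time-averaged mixed strategies (x_avg, y_avg).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    x = [1.0 / n_rows] * n_rows
    y = [1.0 / n_cols] * n_cols
    x_avg = [0.0] * n_rows
    y_avg = [0.0] * n_cols
    for _ in range(steps):
        # Accumulate the strategies actually played in this round.
        for i in range(n_rows):
            x_avg[i] += x[i] / steps
        for j in range(n_cols):
            y_avg[j] += y[j] / steps
        # Expected payoff of each pure action against the opponent's mix.
        row_gain = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_cost = [sum(payoff[i][j] * x[i] for i in range(n_rows))
                    for j in range(n_cols)]
        # Exponential-weights step: rows climb, columns descend.
        x = _normalize([x[i] * math.exp(eta * row_gain[i])
                        for i in range(n_rows)])
        y = _normalize([y[j] * math.exp(-eta * col_cost[j])
                        for j in range(n_cols)])
    return x_avg, y_avg


if __name__ == "__main__":
    print(hedge_mixed_equilibrium([[2.0, -1.0], [-1.0, 1.0]]))
```

For this example game no pure strategy pair is stable, and the averaged strategies approach the mixed equilibrium in which each player picks the first action with probability 2/5.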


          A Toolbox For Property Checking From Simulation Using Incremental SAT (Extended Abstract). (arXiv:1811.02005v1 [cs.SE])

Authors: Rob Sumners (Centaur Technology)

We present a tool that primarily supports the ability to check bounded properties starting from a sequence of states in a run. The target design is compiled into an AIGNET which is then selectively and iteratively translated into an incremental SAT instance in which clauses are added for new terms and simplified by the assignment of existing literals. Additional applications of the tool can be derived by the user providing alternative attachments of constrained functions which guide the iterations and SAT checks performed. Some Verilog RTL examples are included for reference.


          Hardware Distortion Correlation Has Negligible Impact on UL Massive MIMO Spectral Efficiency. (arXiv:1811.02007v1 [cs.IT])

Authors: Emil Björnson, Luca Sanguinetti, Jakob Hoydis

This paper analyzes how the distortion created by hardware impairments in a multiple-antenna base station affects the uplink spectral efficiency (SE), with focus on Massive MIMO. This distortion is correlated across the antennas, but has been often approximated as uncorrelated to facilitate (tractable) SE analysis. To determine when this approximation is accurate, basic properties of distortion correlation are first uncovered. Then, we separately analyze the distortion correlation caused by third-order non-linearities and by quantization. Finally, we study the SE numerically and show that the distortion correlation can be safely neglected in Massive MIMO when there are sufficiently many users. Under i.i.d. Rayleigh fading and equal signal-to-noise ratios (SNRs), this occurs for more than five transmitting users. Other channel models and SNR variations have only minor impact on the accuracy. We also demonstrate the importance of taking the distortion characteristics into account in the receive combining.


          Towards a Unified Theory of Sparsification for Matching Problems. (arXiv:1811.02009v1 [cs.DS])

Authors: Sepehr Assadi, Aaron Bernstein

In this paper, we present a construction of a `matching sparsifier', that is, a sparse subgraph of the given graph that preserves large matchings approximately and is robust to modifications of the graph. We use this matching sparsifier to obtain several new algorithmic results for the maximum matching problem:

* An almost $(3/2)$-approximation one-way communication protocol for the maximum matching problem, significantly simplifying the $(3/2)$-approximation protocol of Goel, Kapralov, and Khanna (SODA 2012) and extending it from bipartite graphs to general graphs.

* An almost $(3/2)$-approximation algorithm for the stochastic matching problem, improving upon and significantly simplifying the previous $1.999$-approximation algorithm of Assadi, Khanna, and Li (EC 2017).

* An almost $(3/2)$-approximation algorithm for the fault-tolerant matching problem, which, to our knowledge, is the first non-trivial algorithm for this problem.

Our matching sparsifier is obtained by proving new properties of the edge-degree constrained subgraph (EDCS) of Bernstein and Stein (ICALP 2015; SODA 2016)---designed in the context of maintaining matchings in dynamic graphs---that identifies EDCS as an excellent choice for a matching sparsifier. This leads to surprisingly simple and non-technical proofs of the above results in a unified way. Along the way, we also provide a much simpler proof of the fact that an EDCS is guaranteed to contain a large matching, which may be of independent interest.


          A Unified Perspective of Evolutionary Game Dynamics Using Generalized Growth Transforms. (arXiv:1811.02010v1 [cs.NE])

Authors: Oindrila Chatterjee, Shantanu Chakrabartty

In this paper, we show that different types of evolutionary game dynamics are, in principle, special cases of a dynamical system model based on our previously reported framework of generalized growth transforms. The framework shows that different dynamics arise as a result of minimizing a population energy such that the population as a whole evolves to reach the most stable state. By introducing a population dependent time-constant in the generalized growth transform model, the proposed framework can be used to explain a vast repertoire of evolutionary dynamics, including some novel forms of game dynamics with non-linear payoffs.


          SkyLogic - A proposal for a skyrmion logic device. (arXiv:1811.02016v1 [cs.ET])

Authors: Meghna G. Mankalale, Zhengyang Zhao, Jian-Ping Wang, Sachin S. Sapatnekar

This work proposes a novel logic device (SkyLogic) based on skyrmions, which are magnetic vortex-like structures that have low depinning current density and are robust to defects. A charge current sent through a polarizer ferromagnet (P-FM) nucleates a skyrmion at the input end of an intra-gate FM interconnect with perpendicular magnetic anisotropy (PMA-FM). The output end of the PMA-FM forms the free layer of an MTJ stack. A spin Hall metal (SHM) is placed beneath the PMA-FM. The skyrmion is propagated to the output end of the PMA-FM by passing a charge current through the SHM. The resistance of the MTJ stack is low (high) when a skyrmion is present (absent) in the free layer, thereby realizing an inverter. A framework is developed to analyze the performance of the SkyLogic device. A circuit-level technique is developed that counters the transverse displacement of the skyrmion in the PMA-FM and allows use of high current densities for fast propagation. The design space exploration of the PMA-FM material parameters is performed to obtain an optimal design point. At the optimal point, we obtain an inverter delay of 434 ps with a switching energy of 7.1 fJ.


          A General Theory of Equivariant CNNs on Homogeneous Spaces. (arXiv:1811.02017v1 [cs.LG])

Authors: Taco Cohen, Mario Geiger, Maurice Weiler

Group equivariant convolutional neural networks (G-CNNs) have recently emerged as a very effective model class for learning from signals in the context of known symmetries. A wide variety of equivariant layers has been proposed for signals on 2D and 3D Euclidean space, graphs, and the sphere, and it has become difficult to see how all of these methods are related, and how they may be generalized.

In this paper, we present a fairly general theory of equivariant convolutional networks. Convolutional feature spaces are described as fields over a homogeneous base space, such as the plane $\mathbb{R}^2$, sphere $S^2$ or a graph $\mathcal{G}$. The theory enables a systematic classification of all existing G-CNNs in terms of their group of symmetry, base space, and field type (e.g. scalar, vector, or tensor field, etc.).

In addition to this classification, we use Mackey theory to show that convolutions with equivariant kernels are the most general class of equivariant maps between such fields, thus establishing G-CNNs as a universal class of equivariant networks. The theory also explains how the space of equivariant kernels can be parameterized for learning, thereby simplifying the development of G-CNNs for new spaces and symmetries. Finally, the theory introduces a rich geometric semantics to learned feature spaces, thus improving interpretability of deep networks, and establishing a connection to central ideas in mathematics and physics.


          Limits of Ordered Graphs and Images. (arXiv:1811.02023v1 [math.CO])

Authors: Omri Ben-Eliezer, Eldar Fischer, Amit Levi, Yuichi Yoshida

The emerging theory of graph limits exhibits an interesting analytic perspective on graphs, showing that many important concepts and tools in graph theory and its applications can be described naturally in analytic language. We extend the theory of graph limits to the ordered setting, presenting a limit object for dense vertex-ordered graphs, which we call an orderon. Images are an example of dense ordered bipartite graphs, where the rows and the columns constitute the vertices, and pixel colors are represented by row-column edges; thus, as a special case, we obtain a limit object for images.

Along the way, we devise an ordered locality-preserving variant of the cut distance between ordered graphs, showing that two graphs are close with respect to this distance if and only if they are similar in terms of their ordered subgraph frequencies. We show that the space of orderons is compact with respect to this distance notion, which is key to a successful analysis of combinatorial objects through their limits. For the proof we combine techniques used in the unordered setting with several new techniques specifically designed to overcome the challenges arising in the ordered setting. We derive several results related to sampling and property testing on ordered graphs and images; for example, we describe how one can use the analytic machinery to obtain a new proof of the ordered graph removal lemma [Alon et al., FOCS 2017].


          Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations. (arXiv:1811.02033v1 [stat.ML])

Authors: Liu Yang, Dongkun Zhang, George Em Karniadakis

We developed a new class of physics-informed generative adversarial networks (PI-GANs) to solve in a unified manner forward, inverse and mixed stochastic problems based on a limited number of scattered measurements. Unlike standard GANs relying only on data for training, here we encoded into the architecture of GANs the governing physical laws in the form of stochastic differential equations (SDEs) using automatic differentiation. In particular, we applied Wasserstein GANs with gradient penalty (WGAN-GP) for their enhanced stability compared to vanilla GANs. We first tested WGAN-GP in approximating Gaussian processes of different correlation lengths based on data realizations collected from simultaneous reads at sparsely placed sensors. We obtained good approximation of the generated stochastic processes to the target ones even for a mismatch between the input noise dimensionality and the effective dimensionality of the target stochastic processes. We also studied the overfitting issue for both the discriminator and generator, and we found that overfitting occurs in the generator as well as in the discriminator, where it had previously been reported. Subsequently, we considered the solution of elliptic SDEs requiring approximations of three stochastic processes, namely the solution, the forcing, and the diffusion coefficient. We used three generators for the PI-GANs: two were feed-forward deep neural networks (DNNs), while the third was the neural network induced by the SDE. Depending on the data, we employed one or multiple feed-forward DNNs as the discriminators in PI-GANs. Here, we have demonstrated the accuracy and effectiveness of PI-GANs in solving SDEs for up to 30 dimensions, but in principle, PI-GANs could tackle very high dimensional problems given more sensor data with low-polynomial growth in computational cost.
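
For readers unfamiliar with WGAN-GP, the standard critic objective from the general WGAN-GP literature can be written as below. This is the generic formulation, not necessarily the exact loss used in PI-GANs; the symbols $D$ (critic), $\mathbb{P}_r$ (data distribution), $\mathbb{P}_g$ (generator distribution) and $\lambda$ (penalty weight) are notation assumed here and do not appear in the abstract.

```latex
% Generic WGAN-GP critic loss (assumed notation, not taken from the abstract).
% \hat{x} is sampled on straight lines between real and generated samples.
\begin{align}
  L_D &= \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\bigl[D(\tilde{x})\bigr]
       - \mathbb{E}_{x \sim \mathbb{P}_r}\bigl[D(x)\bigr]
       + \lambda\, \mathbb{E}_{\hat{x}}\Bigl[\bigl(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\bigr)^2\Bigr], \\
  \hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1].
\end{align}
```

The gradient penalty term pushes the critic toward being 1-Lipschitz, which is the usual explanation for the improved training stability mentioned above.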


          Out-Of-Place debugging: a debugging architecture to reduce debugging interference. (arXiv:1811.02034v1 [cs.SE])

Authors: Matteo Marra (Vrije Universiteit Brussel, Belgium), Guillermo Polito (INRIA, France), Elisa Gonzalez Boix (Vrije Universiteit Brussel, Belgium)

Context. Recent studies show that developers spend most of their programming time testing, verifying and debugging software. As applications become more and more complex, developers demand more advanced debugging support to ease the software development process.

Inquiry. Since the 1970s, many debugging solutions have been introduced. Amongst them, online debuggers provide good insight into the conditions that led to a bug, allowing inspection and interaction with the variables of the program. However, most of the online debugging solutions introduce debugging interference to the execution of the program, i.e. pauses, latency, and evaluation of code containing side-effects.

Approach. This paper investigates a novel debugging technique called out-of-place debugging. The goal is to minimize the debugging interference characteristic of online debugging while allowing online remote capabilities. An out-of-place debugger transfers the program execution and application state from the debugged application to the debugger application, both running in different processes.

Knowledge. On the one hand, out-of-place debugging allows developers to debug applications remotely, overcoming the need for physical access to the machine where the debugged application is running. On the other hand, debugging happens locally on the remote machine, avoiding latency. That makes it suitable to be deployed on a distributed system and to handle the debugging of several processes running in parallel.

Grounding. We implemented a concrete out-of-place debugger for the Pharo Smalltalk programming language. We show that our approach is practical by performing several benchmarks, comparing our approach with a classic remote online debugger. We show that our prototype debugger outperforms a traditional remote debugger by a factor of 1000 in several scenarios. Moreover, we show that the presence of our debugger does not impact the overall performance of an application.

Importance. This work combines remote debugging with the debugging experience of a local online debugger. Out-of-place debugging is the first online debugging technique that can minimize debugging interference while debugging a remote application. Yet, it still keeps the benefits of online debugging (e.g. step-by-step execution). This makes the technique suitable for modern applications which are increasingly parallel, distributed and reactive to streams of data from various sources like sensors, UI, network, etc.


          Entombed: An archaeological examination of an Atari 2600 game. (arXiv:1811.02035v1 [cs.SE])

Authors: John Aycock (University of Calgary, Canada), Tara Copplestone (University of York, United Kingdom)

The act and experience of programming is, at its heart, a fundamentally human activity that results in the production of artifacts. When considering programming, therefore, it would be a glaring omission to not involve people who specialize in studying artifacts and the human activity that yields them: archaeologists. Here we consider this with respect to computer games, the focus of archaeology's nascent subarea of archaeogaming.

One type of archaeogaming research is digital excavation, a technical examination of the code and techniques used in old games' implementation. We apply that in a case study of Entombed, an Atari 2600 game released in 1982 by US Games. The player in this game is, appropriately, an archaeologist who must make their way through a zombie-infested maze. Maze generation is a fruitful area for comparative retrogame archaeology, because a number of early games on different platforms featured mazes, and their variety of approaches can be compared. The maze in Entombed is particularly interesting: it is shaped in part by the extensive real-time constraints of the Atari 2600 platform, and also had to be generated efficiently and use next to no memory. We reverse engineered key areas of the game's code to uncover its unusual maze-generation algorithm, which we have also built a reconstruction of, and analyzed the mysterious table that drives it. In addition, we discovered what appears to be a 35-year-old bug in the code, as well as direct evidence of code-reuse practices amongst game developers.

What further makes this game's development interesting is that, in an era where video games were typically solo projects, a total of five people were involved in various ways with Entombed. We piece together some of the backstory of the game's development and intoxicant-fueled design using interviews to complement our technical work.

Finally, we contextualize this example in archaeology and lay the groundwork for a broader interdisciplinary discussion about programming, one that includes both computer scientists and archaeologists.


          Spectrally stable defect qubits with no inversion symmetry for robust spin-to-photon interface. (arXiv:1811.02037v1 [quant-ph])

Authors: Péter Udvarhelyi, Roland Nagy, Florian Kaiser, Sang-Yun Lee, Jörg Wrachtrup, Adam Gali

Scalable spin-to-photon interfaces require quantum emitters with strong optical transition dipole moment and low coupling to phonons and stray electric fields. It is known that particularly for coupling to stray electric fields, these conditions can be simultaneously satisfied for emitters that show inversion symmetry. Here, we show that inversion symmetry is not a prerequisite for a spectrally stable quantum emitter. We find that identical electron density in ground and excited states can eliminate the coupling to the stray electric fields. Further, a strong optical transition dipole moment is achieved in systems with altering sign of the ground and excited wavefunctions. We use density functional perturbation theory to investigate an optical center that lacks inversion symmetry. Our results show that this system comes close to satisfying the criteria for an ideal quantum emitter. Our study opens a novel rationale in seeking promising materials and point defects towards the realisation of robust spin-to-photon interfaces.


          Conceptua: Institutions in a Topos. (arXiv:1811.02041v1 [cs.LO])

Authors: Robert E. Kent

Tarski's semantic definition of truth is the composition of its extensional and intensional aspects. Abstract satisfaction, the core of the semantic definition of truth, is the basis for the theory of institutions (Goguen and Burstall). The satisfaction relation for first order languages (the truth classification), and the preservation of truth by first order interpretations (the truth infomorphism), form a key motivating example in the theory of Information Flow (IF) (Barwise and Seligman). The concept lattice notion, which is the central structure studied by the theory of Formal Concept Analysis (FCA) (Ganter and Wille), is constructed by the polar factorization of derivation. The study of classification structures (IF) and the study of conceptual structures (FCA) provide a principled foundation for the logical theory of knowledge representation and organization. In an effort to unify these two areas, the paper "Distributed Conceptual Structures" (Kent arXiv:1810.04774) abstracted the basic theorem of FCA in order to establish three levels of categorical equivalence between classification structures and conceptual structures. In this paper, we refine this approach by resolving the equivalence as the category-theoretic factorization of the Galois connection of derivation. The equivalence between classification and conceptual structures is mediated by the opposite motions of factorization and composition. Abstract truth factors through the concept lattice of theories in terms of its extensional and intensional aspects.


          An improved exact algorithm and an NP-completeness proof for sparse matrix bipartitioning. (arXiv:1811.02043v1 [cs.DS])

Authors: Timon E. Knigge, Rob H. Bisseling

We formulate the sparse matrix bipartitioning problem of minimizing the communication volume in parallel sparse matrix-vector multiplication. We prove its $\mathcal{NP}$-completeness in the perfectly balanced case, where both parts of the partitioned matrix must have an equal number of nonzeros, by reduction from the graph bisection problem.

We present an improved exact branch-and-bound algorithm which finds the minimum communication volume for a given maximum allowed imbalance. The algorithm is based on a maximum-flow bound and a packing bound, which extend previous matching and packing bounds.

We implemented the algorithm in a new program called MP (Matrix Partitioner), which solved 839 matrices from the SuiteSparse collection to optimality, each within 24 hours of CPU-time. Furthermore, MP solved the difficult problem of the matrix cage6 in about 3 days. The new program is about 13.8 times faster than the previous program MondriaanOpt.
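
The objective being minimized can be illustrated with a small sketch. It assumes the usual definition of communication volume for a two-way partition of the nonzeros: every row or column whose nonzeros are split across both parts contributes one unit. The function names and the imbalance formula are illustrative assumptions, not taken from MP or MondriaanOpt.

```python
from collections import defaultdict


def communication_volume(nonzeros, part):
    """Communication volume of a bipartitioning of matrix nonzeros.

    nonzeros: list of (row, col) index pairs, one per nonzero.
    part:     list of 0/1 labels, part[k] is the part of nonzeros[k].
    A row or column touched by both parts is "cut" and costs 1.
    """
    row_parts = defaultdict(set)
    col_parts = defaultdict(set)
    for (i, j), p in zip(nonzeros, part):
        row_parts[i].add(p)
        col_parts[j].add(p)
    cut_rows = sum(len(parts) - 1 for parts in row_parts.values())
    cut_cols = sum(len(parts) - 1 for parts in col_parts.values())
    return cut_rows + cut_cols


def imbalance(part):
    """Relative load imbalance: largest part size over the average, minus 1.

    This particular formula is an assumption made for illustration.
    """
    n1 = sum(part)
    n0 = len(part) - n1
    return max(n0, n1) / (len(part) / 2) - 1.0


# A 3x3 matrix with six nonzeros, split into two parts of three nonzeros.
nonzeros = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]
part = [0, 0, 0, 1, 1, 1]
print(communication_volume(nonzeros, part), imbalance(part))
```

In this example row 1 and column 0 each contain nonzeros from both parts, so they are the only cut lines. An exact solver such as the one described above searches over all assignments that respect the allowed imbalance for the one with the smallest volume.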


          Improving Trajectory Optimization using a Roadmap Framework. (arXiv:1811.02044v1 [cs.RO])

Authors: Siyu Dai, Matthew Orton, Shawn Schaffert, Andreas Hofmann, Brian Williams

We present an evaluation of several representative sampling-based and optimization-based motion planners, and then introduce an integrated motion planning system which incorporates recent advances in trajectory optimization into a sparse roadmap framework. Through experiments in 4 common application scenarios with 5000 test cases each, we show that optimization-based or sampling-based planners alone are not effective for realistic problems where fast planning times are required. To the best of our knowledge, this is the first work that presents such a systematic and comprehensive evaluation of state-of-the-art motion planners, based on a significant number of experiments. We then combine different stand-alone planners with trajectory optimization. The results show that the combination of our sparse roadmap and trajectory optimization provides superior performance over other standard sampling-based planner combinations. By using a multi-query roadmap instead of generating completely new trajectories for each planning problem, our approach allows for extensions such as persistent control policy information associated with a trajectory across planning problems. Also, the sub-optimality resulting from the sparsity of the roadmap, as well as unexpected disturbances from the environment, can both be overcome by the real-time trajectory optimization process.


          Non-Local Compressive Sensing Based SAR Tomography. (arXiv:1811.02046v1 [cs.CV])

Authors: Yilei Shi, Xiao Xiang Zhu, Richard Bamler

Tomographic SAR (TomoSAR) inversion of urban areas is an inherently sparse reconstruction problem and, hence, can be solved using compressive sensing (CS) algorithms. This paper proposes solutions for two notorious problems in this field: 1) TomoSAR requires a high number of data sets, which makes the technique expensive. However, it can be shown that the number of acquisitions and the signal-to-noise ratio (SNR) can be traded off against each other, because it is asymptotically only the product of the number of acquisitions and SNR that determines the reconstruction quality. We propose to increase SNR by integrating non-local estimation into the inversion and show that a reasonable reconstruction of buildings from only seven interferograms is feasible. 2) CS-based inversion is computationally expensive and therefore barely suitable for large-scale applications. We introduce a new fast and accurate algorithm for solving the non-local L1-L2-minimization problem, central to CS-based reconstruction algorithms. The applicability of the algorithm is demonstrated using simulated data and TerraSAR-X high-resolution spotlight images over an area in Munich, Germany.


          Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation. (arXiv:1811.02050v1 [cs.CL])

Authors: Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu

End-to-end Speech Translation (ST) models have many potential advantages when compared to the cascade of Automatic Speech Recognition (ASR) and text Machine Translation (MT) models, including lowered inference latency and the avoidance of error compounding. However, the quality of end-to-end ST is often limited by a paucity of training data, since it is difficult to collect large parallel corpora of speech and translated transcript pairs. Previous studies have proposed the use of pre-trained components and multi-task learning in order to benefit from weakly supervised training data, such as speech-to-transcript or text-to-foreign-text pairs. In this paper, we demonstrate that using pre-trained MT or text-to-speech (TTS) synthesis models to convert weakly supervised data into speech-to-translation pairs for ST training can be more effective than multi-task learning. Furthermore, we demonstrate that a high quality end-to-end ST model can be trained using only weakly supervised datasets, and that synthetic data sourced from unlabeled monolingual text or speech can be used to improve performance. Finally, we discuss methods for avoiding overfitting to synthetic speech with a quantitative ablation study.


          Managing engineering systems with large state and action spaces through deep reinforcement learning. (arXiv:1811.02052v1 [cs.SY])

Authors: C.P. Andriotis, K.G. Papakonstantinou

Decision-making for engineering systems can be efficiently formulated as a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). Typical MDP and POMDP solution procedures utilize offline knowledge about the environment and provide detailed policies for relatively small systems with tractable state and action spaces. However, in large multi-component systems the sizes of these spaces easily explode, as system states and actions scale exponentially with the number of components, whereas environment dynamics are difficult to describe in explicit form for the entire system and may only be accessible through numerical simulators. In this work, to address these issues, an integrated Deep Reinforcement Learning (DRL) framework is introduced. The Deep Centralized Multi-agent Actor Critic (DCMAC), an off-policy actor-critic DRL approach, is developed to provide efficient life-cycle policies for large multi-component systems operating in high-dimensional spaces. Apart from deep function approximations that parametrize large state spaces, DCMAC also adopts a factorized representation of the system actions, being able to designate individualized component- and subsystem-level decisions, while maintaining a centralized value function for the entire system. DCMAC compares well against Deep Q-Network (DQN) solutions and exact policies, where applicable, and outperforms optimized baselines that are based on time-based, condition-based and periodic policies.


          Throughput-based Design for Polar Coded-Modulation. (arXiv:1811.02053v1 [cs.IT])

Authors: Hossein Khoshnevis, Ian Marsland, Halim Yanikomeroglu

Typically, forward error correction (FEC) codes are designed based on the minimization of the error rate for a given code rate. However, for applications that incorporate hybrid automatic repeat request (HARQ) protocol and adaptive modulation and coding, the throughput is a more important performance metric than the error rate. Polar codes, a new class of FEC codes with simple rate matching, can be optimized efficiently for maximization of the throughput. In this paper, we aim to design HARQ schemes using multilevel polar coded-modulation (MLPCM). Thus, we first develop a method to determine a set-partitioning based bit-to-symbol mapping for high order QAM constellations. We simplify the LLR estimation of set-partitioned QAM constellations for a multistage decoder, and we introduce a set of algorithms to design throughput-maximizing MLPCM for the successive cancellation decoding (SCD). These codes are specifically useful for non-combining (NC) and Chase-combining (CC) HARQ protocols. Furthermore, since optimized codes for SCD are not optimal for SC list decoders (SCLD), we propose a rate matching algorithm to find the best rate for SCLD while using the polar codes optimized for SCD. The resulting codes provide throughput close to the capacity with low decoding complexity when used with NC or CC HARQ.


          Model Extraction and Active Learning. (arXiv:1811.02054v1 [cs.LG])

Authors: Varun Chandrasekaran, Kamalika Chaudhuri, Irene Giacomelli, Somesh Jha, Songbai Yan

Machine learning is being increasingly used by individuals, research institutions, and corporations. This has resulted in the surge of Machine Learning-as-a-Service (MLaaS) - cloud services that provide (a) tools and resources to learn the model, and (b) a user-friendly query interface to access the model. However, such MLaaS systems raise privacy concerns, one being model extraction. Adversaries maliciously exploit the query interface to steal the model. More precisely, in a model extraction attack, a good approximation of a sensitive or proprietary model held by the server is extracted (i.e. learned) by a dishonest user. Such a user only sees the answers to select queries sent using the query interface. This attack was recently introduced by Tramer et al. at the 2016 USENIX Security Symposium, where practical attacks for different models were shown. We believe that better understanding the efficacy of model extraction attacks is paramount in designing better privacy-preserving MLaaS systems. To that end, we take the first step by (a) formalizing model extraction and proposing the first definition of extraction defense, and (b) drawing parallels between model extraction and the better investigated active learning framework. In particular, we show that recent advancements in the active learning domain can be used to implement both model extraction, and defenses against such attacks.


          The Marchex 2018 English Conversational Telephone Speech Recognition System. (arXiv:1811.02058v1 [cs.CL])

Authors: Seongjun Hahm, Iroro Orife, Shane Walker, Jason Flaks

In this paper, we describe recent improvements to the production Marchex speech recognition system for our spontaneous customer-to-business telephone conversations. We outline our semi-supervised lattice-free maximum mutual information (LF-MMI) training process which can supervise over full lattices from unlabeled audio. We also elaborate on production-scale text selection techniques for constructing very large conversational language models (LMs). On Marchex English (ME), a modern evaluation set of conversational North American English, for acoustic modeling we report a 3.3% ({agent, caller}:{3.2%, 3.6%}) reduction in absolute word error rate (WER). For language modeling, we observe a separate {1.3%, 1.2%} point reduction on {agent, caller} utterances respectively over the performance of the 2017 production system.
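
As background for the figures above, word error rate is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below follows that standard definition; the function name and the example sentences are made up for illustration and are unrelated to the Marchex data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.3f}")
```

An absolute reduction, as reported in the abstract, is a difference in percentage points between two such rates; a relative reduction would instead divide that difference by the baseline rate.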


          STAR: Scaling Transactions through Asymmetrical Replication. (arXiv:1811.02059v1 [cs.DB])

Authors: Yi Lu, Xiangyao Yu, Samuel Madden

In this paper, we present STAR, a new distributed and replicated in-memory database. By employing a single-node non-partitioned architecture for some replicas and a partitioned architecture for other replicas, STAR is able to efficiently run both highly partitionable workloads and workloads that involve cross-partition transactions. The key idea is a new phase-switching algorithm in which the execution of single-partition transactions is separated from that of cross-partition transactions. In the partitioned phase, single-partition transactions are run on multiple machines in parallel to exploit more concurrency. In the single-master phase, mastership for the entire database is switched to a designated coordinator node, which can execute cross-partition transactions without the use of expensive coordination protocols like two-phase commit. Because the coordinator node has a full copy of the database, this phase-switching can be done at negligible cost. Our experiments on two popular benchmarks (YCSB and TPC-C) show that high availability via replication can coexist with fast serializable transaction execution in distributed in-memory databases, with STAR outperforming systems that employ conventional concurrency control and replication algorithms by up to one order of magnitude.


          A Recurrent Graph Neural Network for Multi-Relational Data. (arXiv:1811.02061v1 [cs.LG])

Authors: Vassilis N. Ioannidis, Antonio G. Marques, Georgios B. Giannakis

The era of data deluge has sparked the interest in graph-based learning methods in a number of disciplines such as sociology, biology, neuroscience, or engineering. In this paper, we introduce a graph recurrent neural network (GRNN) for scalable semi-supervised learning from multi-relational data. Key aspects of the novel GRNN architecture are the use of multi-relational graphs, the dynamic adaptation to the different relations via learnable weights, and the consideration of graph-based regularizers to promote smoothness and alleviate over-parametrization. Our ultimate goal is to design a powerful learning architecture able to: discover complex and highly non-linear data associations, combine (and select) multiple types of relations, and scale gracefully with respect to the size of the graph. Numerical tests with real data sets corroborate the design goals and illustrate the performance gains relative to competing alternatives.


          End-to-End Monaural Multi-speaker ASR System without Pretraining. (arXiv:1811.02062v1 [cs.CL])

Authors: Xuankai Chang, Yanmin Qian, Kai Yu, Shinji Watanabe

Recently, end-to-end models have become a popular alternative to traditional hybrid models in automatic speech recognition (ASR). Multi-speaker speech separation and recognition is a central task in the cocktail party problem. In this paper, we present a state-of-the-art monaural multi-speaker end-to-end automatic speech recognition model. In contrast to previous studies on monaural multi-speaker speech recognition, this end-to-end framework is trained to recognize multiple label sequences completely from scratch. The system only requires the speech mixture and corresponding label sequences, without needing any indeterminate supervisions obtained from non-mixture speech or corresponding labels/alignments. Moreover, we use an individual attention module for each separated speaker, together with scheduled sampling, to further improve the performance. Finally, we evaluate the proposed model on the 2-speaker mixed speech generated from the WSJ corpus and the wsj0-2mix dataset, which is a speech separation and recognition benchmark. The experiments demonstrate that the proposed methods can improve the performance of the end-to-end model in separating the overlapping speech and recognizing the separated streams. From the results, the proposed model leads to ~10.0% relative performance gains in terms of CER and WER respectively.


          When CTC Training Meets Acoustic Landmarks. (arXiv:1811.02063v1 [eess.AS])

Authors: Di He, Xuesong Yang, Boon Pang Lim, Yi Liang, Mark Hasegawa-Johnson, Deming Chen

The connectionist temporal classification (CTC) training criterion provides an alternative acoustic model (AM) training strategy for automatic speech recognition in an end-to-end fashion. Although the CTC criterion benefits acoustic modeling by removing the need for time-aligned phonetic transcription, it still requires considerable tuning to converge, especially in the resource-constrained scenario. In this paper, we propose to improve CTC training by incorporating acoustic landmarks. We tailored a new set of acoustic landmarks to help CTC training converge more quickly while also reducing recognition error rates. We leveraged new target label sequences mixed with both phone and manner changes to guide CTC training. Experiments on TIMIT demonstrated that CTC-based acoustic models converge significantly faster and more smoothly when they are augmented by acoustic landmarks. The models pretrained with mixed target labels can be further fine-tuned, which reduced the phone error rate by 8.72% on TIMIT. A consistent performance gain is also observed on reduced TIMIT and WSJ, in which case we are the first to succeed in testing the effectiveness of acoustic landmark theory on mid-sized ASR tasks.


          How to Improve Your Speaker Embeddings Extractor in Generic Toolkits. (arXiv:1811.02066v1 [cs.SD])

Authors: Hossein Zeinali, Lukas Burget, Johan Rohdin, Themos Stafylakis, Jan Cernocky

Recently, speaker embeddings extracted with deep neural networks have become the state-of-the-art method for speaker verification. In this paper we aim to facilitate its implementation on a more generic toolkit than Kaldi, which we anticipate will enable further improvements to the method. We examine several tricks in training, such as the effects of normalizing input features and pooled statistics, different methods for preventing overfitting, as well as alternative non-linearities that can be used instead of Rectified Linear Units. In addition, we investigate the difference in performance between TDNN and CNN, and between two types of attention mechanism. Experimental results on the Speakers in the Wild, SRE 2016 and SRE 2018 datasets demonstrate the effectiveness of the proposed implementation.


          Generalization Bounds for Neural Networks: Kernels, Symmetry, and Sample Compression. (arXiv:1811.02067v1 [cs.LG])

Authors: Christopher Snyder, Sriram Vishwanath

Though Deep Neural Networks (DNNs) are widely celebrated for their practical performance, they demonstrate many intriguing phenomena related to depth that are difficult to explain both theoretically and intuitively. Understanding how weights in deep networks coordinate together across layers to form useful learners has proven somewhat intractable, in part because of the repeated composition of nonlinearities induced by depth. We present a reparameterization of DNNs as a linear function of a particular feature map that is locally independent of the weights. This feature map transforms depth-dependencies into simple tensor products and maps each input to a discrete subset of the feature space. Then, in analogy with logistic regression, we propose a max-margin assumption that enables us to present a so-called sample compression representation of the neural network in terms of the discrete activation state of neurons induced by $s$ "support vectors". We show how the number of support vectors relates to learning guarantees for neural networks through sample compression bounds, yielding a sample complexity $O(ns/\epsilon)$ for networks with $n$ neurons. Additionally, this number of support vectors has monotonic dependence on width, depth, and label noise for simple networks trained on the MNIST dataset.


          False Analog Data Injection Attack Towards Topology Errors: Formulation and Feasibility Analysis. (arXiv:1811.02068v1 [cs.SY])

Authors: Yuqi Zhou, Jorge Cisneros-Saldana, Le Xie

In this paper, we propose a class of false analog data injection attacks that can mislead the system into behaving as if topology errors had occurred. By utilizing the measurement redundancy with respect to the state variables, the adversary who knows the system configuration is shown to be capable of computing the corresponding measurement value with the intentionally misguided topology. The attack is designed such that the state as well as residue distribution after state estimation will converge to those in the system with a topology error. It is shown that the attack can be launched even if the attacker is constrained to some specific meters. The attack is detrimental to the system since manipulation of analog data will lead to a forged digital topology status, and the state after the error is identified and modified will be significantly biased with the intended wrong topology. The feasibility of the proposed attack is demonstrated with an IEEE 14-bus system.


          Blind Two-Dimensional Super-Resolution and Its Performance Guarantee. (arXiv:1811.02070v1 [cs.IT])

Authors: Mohamed A. Suliman, Wei Dai

Super-resolution techniques are concerned with extracting fine-scale data from low-resolution information. In this work, we study the problem of identifying the parameters of a linear system from its response to multiple unknown input waveforms. We assume that the system response, which is the only given information, is a scaled superposition of time-delayed and frequency-shifted versions of the unknown waveforms. This kind of problem is severely ill-posed and does not yield a solution without introducing further constraints. To fully characterize the linear system, we assume that the unknown waveforms lie in a common known low-dimensional subspace that satisfies certain randomness and concentration properties. Then, we develop a blind two-dimensional (2D) super-resolution framework that applies to a large number of applications such as radar imaging, image restoration, and indoor source localization. In this framework, we show that under a minimum separation condition between the time-frequency shifts, all the unknowns that characterize the linear system can be recovered precisely and with very high probability provided that a lower bound on the total number of observed samples is satisfied. The proposed framework is based on a 2D atomic norm minimization problem, which is shown to be reformulated and solved efficiently via semidefinite programming. Simulation results that confirm the theoretical findings of the paper are provided.


          Scale-free Networks Well Done. (arXiv:1811.02071v1 [physics.soc-ph])

Authors: Ivan Voitalov, Pim van der Hoorn, Remco van der Hofstad, Dmitri Krioukov

We bring rigor to the vibrant activity of detecting power laws in empirical degree distributions in real-world networks. We first provide a rigorous definition of power-law distributions, equivalent to the definition of regularly varying distributions in statistics. This definition allows the distribution to deviate from a pure power law arbitrarily but without affecting the power-law tail exponent. We then identify three estimators of these exponents that are proven to be statistically consistent -- that is, converging to the true exponent value for any regularly varying distribution -- and that satisfy some additional niceness requirements. Finally, we apply these estimators to a representative collection of synthetic and real-world data. According to their estimates, real-world scale-free networks are definitely not as rare as one would conclude based on the popular but unrealistic assumption that real-world data comes from power laws of pristine purity, void of noise and deviations.
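
As a rough illustration of tail-exponent estimation, the sketch below implements the classical Hill estimator on synthetic Pareto data. The abstract does not name its three estimators, so treating Hill as a representative example is an assumption here; the function name `hill_estimator`, the sample size, and the choice of `k` are all illustrative.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the tail index xi from the k largest samples."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    return sum(math.log(v / threshold) for v in ordered[:k]) / k


# Synthetic Pareto data with P(X > x) = x^(-alpha) for x >= 1 (assumed test case).
rng = random.Random(0)
alpha = 2.0
data = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(20000)]

xi = hill_estimator(data, k=2000)
print("estimated tail index xi:", xi)       # should be close to 1 / alpha
print("implied exponent alpha:", 1.0 / xi)
```

For data that only approximately follow a power law, the estimate depends on how many upper order statistics `k` are used, which is one reason consistency results for regularly varying distributions matter.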


          QUOTA: The Quantile Option Architecture for Reinforcement Learning. (arXiv:1811.02073v1 [cs.LG])

Authors: Shangtong Zhang, Borislav Mavrin, Hengshuai Yao, Linglong Kong, Bo Liu

In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUOTA provides a new dimension for exploration via making use of both optimism and pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.


          Leveraging Virtual and Real Person for Unsupervised Person Re-identification. (arXiv:1811.02074v1 [cs.CV])

Authors: Fengxiang Yang, Zhun Zhong, Zhiming Luo, Sheng Lian, Shaozi Li

Person re-identification (re-ID) is a challenging problem, especially when no labels are available for training. Although recent deep re-ID methods have achieved great improvement, it is still difficult to optimize a deep re-ID model without annotations in the training data. To address this problem, this study introduces a novel approach for unsupervised person re-ID by leveraging virtual and real data. Our approach includes two components: virtual person generation and training of a deep re-ID model. For virtual person generation, we learn a person generation model and a camera style transfer model using unlabeled real data to generate virtual persons with different poses and camera styles. The virtual data is formed as labeled training data, enabling subsequent supervised training of the deep re-ID model. For training of the deep re-ID model, we divide it into three steps: 1) pre-training a coarse re-ID model by using virtual data; 2) collaborative filtering based positive pair mining from the real data; and 3) fine-tuning of the coarse re-ID model by leveraging the mined positive pairs and virtual data. The final re-ID model is achieved by iterating between step 2 and step 3 until convergence. Experimental results on two large-scale datasets, Market-1501 and DukeMTMC-reID, demonstrate the effectiveness of our approach and show that the state of the art is achieved in unsupervised person re-ID.


          Improving Span-based Question Answering Systems with Coarsely Labeled Data. (arXiv:1811.02076v1 [cs.CL])

Authors: Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova

We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance and show that coarsely annotated data can bring significant performance gains. Experiments demonstrate that the standard multi-task learning approach of sharing representations is not the most effective way to leverage coarse-grained annotations. Instead, we can explicitly model the latent fine-grained short answer variables and optimize the marginal log-likelihood directly or use a newly proposed posterior distillation learning objective. Since these latent-variable methods have explicit access to the relationship between the fine and coarse tasks, they result in significantly larger improvements from coarse supervision.


          Optimal Succinct Rank Data Structure via Approximate Nonnegative Tensor Decomposition. (arXiv:1811.02078v1 [cs.DS])

Authors: Huacheng Yu

Given an $n$-bit array $A$, the succinct rank data structure problem asks to construct a data structure using space $n+r$ bits for $r\ll n$, supporting rank queries of the form $\mathtt{rank}(x)=\sum_{i=0}^{x-1} A[i]$. In this paper, we design a new succinct rank data structure with $r=n/(\log n)^{\Omega(t)}+n^{1-c}$ and query time $O(t)$ for some constant $c>0$, improving the previous best-known result by Patrascu [Pat08], which has $r=n/(\frac{\log n}{t})^{\Omega(t)}+\tilde{O}(n^{3/4})$ bits of redundancy. For $r>n^{1-c}$, our space-time tradeoff matches the cell-probe lower bound by Patrascu and Viola [PV10], which asserts that $r$ must be at least $n/(\log n)^{O(t)}$. Moreover, one can avoid an $n^{1-c}$-bit lookup table when the data structure is implemented in the cell-probe model, achieving $r=\lceil n/(\log n)^{\Omega(t)}\rceil$. It matches the lower bound for the full range of parameters.

En route to our new data structure design, we establish an interesting connection between succinct data structures and approximate nonnegative tensor decomposition. Our connection shows that for specific problems, to construct a space-efficient data structure, it suffices to approximate a particular tensor by a sum of (few) nonnegative rank-$1$ tensors. For the rank problem, we explicitly construct such an approximation, which yields an explicit construction of the data structure.
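
To make the rank query concrete, here is a minimal sketch of a textbook block-based rank structure. It is not the construction from the paper and makes no attempt at the stated redundancy bounds; the class name `BlockRank` and the `block_size` parameter are illustrative assumptions.

```python
class BlockRank:
    """Toy rank structure: per-block prefix counts plus a short in-block scan."""

    def __init__(self, bits, block_size=8):
        self.bits = list(bits)
        self.block_size = block_size
        # prefix[j] holds the number of ones in bits[0 : j * block_size].
        self.prefix = [0]
        for start in range(0, len(self.bits), block_size):
            ones = sum(self.bits[start:start + block_size])
            self.prefix.append(self.prefix[-1] + ones)

    def rank(self, x):
        """Return sum(bits[0:x]), the number of ones strictly before position x."""
        block, offset = divmod(x, self.block_size)
        start = block * self.block_size
        return self.prefix[block] + sum(self.bits[start:start + offset])


bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
index = BlockRank(bits, block_size=4)
print([index.rank(x) for x in range(len(bits) + 1)])
```

The stored prefix counts are the redundancy in this toy version: larger blocks shrink the table but lengthen the scan inside a block, which is the kind of space-time tradeoff the abstract quantifies.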


          Mesh-TensorFlow: Deep Learning for Supercomputers. (arXiv:1811.02084v1 [cs.LG])

Authors: Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the "batch" dimension, in Mesh-TensorFlow, the user can specify any tensor-dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into an SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce. We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model. Using TPU meshes of up to 512 cores, we train Transformer models with up to 5 billion parameters, surpassing state-of-the-art results on the WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark. Mesh-TensorFlow is available at https://github.com/tensorflow/mesh .
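
The idea of splitting a tensor dimension other than "batch" can be sketched in plain Python without using the Mesh-TensorFlow API at all. Below, the contracted dimension of a matrix product is split across simulated processors and the partial results are summed, standing in for an Allreduce. The function names and the one-dimensional "mesh" are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain dense matrix product on lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_contraction_matmul(x, w, num_procs):
    """Compute x @ w with the contracted dimension split over simulated processors."""
    d = len(w)
    step = (d + num_procs - 1) // num_procs
    partials = []
    for p in range(num_procs):
        lo, hi = p * step, min((p + 1) * step, d)
        if lo >= hi:
            continue  # this simulated processor received an empty shard
        x_shard = [row[lo:hi] for row in x]   # columns lo..hi of x
        w_shard = w[lo:hi]                    # rows lo..hi of w
        partials.append(matmul(x_shard, w_shard))
    # Stand-in for Allreduce(sum): add the partial products elementwise.
    rows, cols = len(x), len(w[0])
    return [[sum(part[i][j] for part in partials) for j in range(cols)]
            for i in range(rows)]


x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0], [0, 1], [1, 1], [2, 0]]
print(split_contraction_matmul(x, w, num_procs=2))
print(matmul(x, w))
```

Splitting the batch dimension of `x` instead would need no summation step, which is one way to see why data-parallelism is the simpler special case.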


          Motif and Hypergraph Correlation Clustering. (arXiv:1811.02089v1 [cs.DS])

Authors: Pan Li, Gregory J. Puleo, Olgica Milenkovic

Motivated by applications in social and biological network analysis, we introduce a new form of agnostic clustering termed motif correlation clustering, which aims to minimize the cost of clustering errors associated with both edges and higher-order network structures. The problem may be succinctly described as follows: Given a complete graph $G$, partition the vertices of the graph so that certain predetermined "important" subgraphs mostly lie within the same cluster, while "less relevant" subgraphs are allowed to lie across clusters. Our contributions are as follows: We first introduce several variants of motif correlation clustering and then show that these clustering problems are NP-hard. We then proceed to describe polynomial-time clustering algorithms that provide constant approximation guarantees for the problems at hand. Despite following the frequently used LP relaxation and rounding procedure, the algorithms involve a sophisticated and carefully designed neighborhood growing step that combines information about both edge and motif structures. We conclude with several examples illustrating the performance of the developed algorithms on synthetic and real networks.


          Classification of 12-Lead ECG Signals with Bi-directional LSTM Network. (arXiv:1811.02090v1 [cs.CV])

Authors: Ahmed Mostayed, Junye Luo, Xingliang Shu, William Wee

We propose a recurrent neural network classifier to detect pathologies in 12-lead ECG signals and train and validate the classifier with the Chinese physiological signal challenge dataset (this http URL). The recurrent neural network consists of two bi-directional LSTM layers and can train on arbitrary-length ECG signals. Our best trained model achieved an average F1 score of 74.15% on the validation set.

Keywords: ECG classification, Deep learning, RNN, Bi-directional LSTM, QRS detection.


          Simple, Distributed, and Accelerated Probabilistic Programming. (arXiv:1811.02091v1 [stat.ML])

Authors: Dustin Tran, Matthew Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous

We describe a simple, low-level approach for embedding probabilistic programming in a deep learning ecosystem. In particular, we distill probabilistic programming down to a single abstraction: the random variable. Our lightweight implementation in TensorFlow enables numerous applications: a model-parallel variational auto-encoder (VAE) with 2nd-generation tensor processing units (TPUv2s); a data-parallel autoregressive model (Image Transformer) with TPUv2s; and a multi-GPU No-U-Turn Sampler (NUTS). For both a state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256 CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2 chips. With NUTS, we see a 100x speedup on GPUs over Stan and 37x over PyMC3.


          Kernel Machines Beat Deep Neural Networks on Mask-based Single-channel Speech Enhancement. (arXiv:1811.02095v1 [cs.LG])

Authors: Like Hui, Siyuan Ma, Mikhail Belkin

We apply a fast kernel method for mask-based single-channel speech enhancement. Specifically, our method solves a kernel regression problem associated with a non-smooth kernel function (exponential power kernel) using a highly efficient iterative method (EigenPro). Due to the simplicity of this method, its hyper-parameters such as kernel bandwidth can be automatically and efficiently selected using line search with subsamples of training data. We observe an empirical correlation between the regression loss (mean square error) and regular metrics for speech enhancement. This observation justifies our training target and motivates us to achieve lower regression loss by training a separate kernel model per frequency subband. We compare our method with state-of-the-art deep neural networks on mask-based HINT and TIMIT. Experimental results show that our kernel method consistently outperforms deep neural networks while requiring less training time.


          Scale calibration for high-dimensional robust regression. (arXiv:1811.02096v1 [math.ST])

Authors: Po-Ling Loh

We present a new method for high-dimensional linear regression when a scale parameter of the additive errors is unknown. The proposed estimator is based on a penalized Huber $M$-estimator, for which theoretical results on estimation error have recently been proposed in the high-dimensional statistics literature. However, the variance of the error term in the linear model is intricately connected to the optimal parameter used to define the shape of the Huber loss. Our main idea is to use an adaptive technique, based on Lepski's method, to overcome the difficulties in solving a joint nonconvex optimization problem with respect to the location and scale parameters.
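
For readers unfamiliar with the Huber loss, the sketch below shows the standard definition and why its shape parameter interacts with the error scale. The parameter name `delta` and the sample residuals are illustrative assumptions; the adaptive, Lepski-style selection described in the abstract is not implemented here.

```python
def huber_loss(residual, delta):
    """Standard Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


# The same residuals are treated very differently depending on delta,
# so a sensible delta has to be chosen relative to the (unknown) error scale.
residuals = [0.1, 0.5, 3.0]
for delta in (0.2, 1.0, 5.0):
    print(delta, [round(huber_loss(r, delta), 4) for r in residuals])
```

With a small `delta` almost every residual falls in the linear regime, and with a large one the loss behaves like least squares, which is why the scale of the noise matters when fixing the shape.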


          Image-Based Reconstruction for a 3D-PFHS Heat Transfer Problem by ReConNN. (arXiv:1811.02102v1 [cs.CE])

Authors: Yu Li, Hu Wang, Xinjian Deng

The heat transfer performance of the Plate Fin Heat Sink (PFHS) has been investigated experimentally and extensively. Commonly, the objective function of PFHS design is based on the responses of simulations. Compared with existing studies, the purpose of this work is to transfer from image-based model to analysis-based model for heat sink designs. It means that the sequential optimization should be based on images instead of responses. Therefore, an image-based reconstruction model of a heat transfer process for a 3D-PFHS is established. Unlike image recognition, such a procedure cannot be implemented directly by existing recognition algorithms (e.g. Convolutional Neural Networks). Therefore, a Reconstructive Neural Network (ReConNN), integrating supervised learning and unsupervised learning techniques, is suggested. According to the experimental results, the heat transfer process can be observed in more detail and more clearly, and the reconstructed results are meaningful for further optimization.


          On the role of neurogenesis in overcoming catastrophic forgetting. (arXiv:1811.02113v1 [cs.NE])

Authors: German I. Parisi, Xu Ji, Stefan Wermter

Lifelong learning capabilities are crucial for artificial autonomous agents operating on real-world data, which is typically non-stationary and temporally correlated. In this work, we demonstrate that dynamically grown networks outperform static networks in incremental learning scenarios, even when bounded by the same amount of memory in both cases. Learning is unsupervised in our models, a condition that additionally makes training more challenging whilst increasing the realism of the study, since humans are able to learn without dense manual annotation. Our results on artificial neural networks reinforce that structural plasticity constitutes effective prevention against catastrophic forgetting in non-stationary environments, as well as empirically supporting the importance of neurogenesis in the mammalian brain.


          DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. (arXiv:1811.02114v1 [q-bio.QM])

Authors: Ingoo Lee, Jongsoo Keum, Hojung Nam

Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors are shown to be not informative enough to predict accurate DTIs. Thus, in this study, we employ a convolutional neural network (CNN) on raw protein sequences to capture local residue patterns participating in DTIs. With CNN on protein sequences, our model performs better than previous protein descriptor-based models. In addition, our model performs better than the previous deep learning model for massive prediction of DTIs. By examining the pooled convolution results, we found that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches.


          Modeling and Predicting Popularity Dynamics via Deep Learning Attention Mechanism. (arXiv:1811.02117v1 [cs.SI])

Authors: Sha Yuan, Yu Zhang, Jie Tang, Huawei Shen, Xingxing Wei

The ability to predict the popularity dynamics of individual items within a complex evolving system has important implications in a wide range of domains. Here we propose a deep learning attention mechanism to model the process through which individual items gain their popularity. We analyze the interpretability of the model with the four key phenomena confirmed independently in previous studies of long-term popularity dynamics quantification: the intrinsic quality, the aging effect, the recency effect, and the Matthew effect. We analyze the effectiveness of introducing an attention model in popularity dynamics prediction. Extensive experiments on a large real-world citation data set demonstrate that the designed deep learning attention mechanism possesses remarkable power at predicting long-term popularity dynamics. It consistently outperforms the existing methods and achieves a significant performance improvement.


          Motion Planning for a UAV with a Straight or Kinked Tether. (arXiv:1811.02119v1 [cs.RO])

Authors: Xuesu Xiao, Jan Dufek, Mohamed Suhail, Robin Murphy

This paper develops and compares two motion planning algorithms for a tethered UAV with and without the possibility of the tether contacting the confined and cluttered environment. Tethered aerial vehicles have been studied due to their advantages such as power duration, stability, and safety. However, the disadvantages brought in by the extra tether have not been well investigated by the robotic locomotion community, especially when the tethered agent is locomoting in a non-free space occupied with obstacles. In this work, we propose two motion planning frameworks that (1) reduce the reachable configuration space by taking into account the tether and (2) deliberately plan (and relax) the contact point(s) of the tether with the environment and enable a reachable configuration space equivalent to that of the non-tethered counterpart. Both methods are tested on a physical robot, Fotokite Pro. With our approaches, tethered aerial vehicles could find applications in confined and cluttered environments with obstacles, as opposed to ideal free space, while still maintaining the advantages of using a tether. The motion planning strategies are particularly suitable for marsupial heterogeneous robotic teams, such as visual servoing/assisting for another mobile, tele-operated primary robot.


          Digital Signature Security in Data Communication. (arXiv:1811.02120v1 [cs.CR])

Authors: Robbi Rahim, Andri Pranolo, Ronal Hadi, Rasyidah, Heri Nurdiyanto, Darmawan Napitupulu, Ansari Saleh Ahmar, Leon Andretti Abdillah, Dahlan Abdullah

Authenticity of access to information is very important in the current era of Internet-based technology. There are many ways to secure information against irresponsible parties who mount various security attacks; techniques that can be used to defend against such attacks include steganography, cryptography, and digital signatures. Digital signatures can be one solution: the authenticity of the message is verified to prove that the received message is the original message without any change. Ong-Schnorr-Shamir is the algorithm used in this research, and the experiments are performed on the digital signature scheme and the hidden channel scheme.
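
The signing and verification steps of the basic Ong-Schnorr-Shamir scheme can be sketched as follows. This is a toy illustration with a tiny modulus and a fixed value `r` for reproducibility, not the authors' experimental setup; the function names and all numeric values are assumptions, and real parameters would need to be far larger.

```python
from math import gcd


def oss_keygen(n, k):
    """Public key h = -(k^-2) mod n for private key k (requires gcd(k, n) = 1)."""
    if gcd(k, n) != 1:
        raise ValueError("k must be coprime to n")
    return (-pow(k, -2, n)) % n


def oss_sign(m, k, n, r):
    """Sign message m using private key k and a per-message value r coprime to n."""
    inv2 = pow(2, -1, n)                  # n must be odd for 1/2 to exist mod n
    m_over_r = (m * pow(r, -1, n)) % n
    s1 = (inv2 * (m_over_r + r)) % n
    s2 = (inv2 * k * (m_over_r - r)) % n
    return s1, s2


def oss_verify(m, s1, s2, h, n):
    """Accept when s1^2 + h * s2^2 is congruent to m modulo n."""
    return (s1 * s1 + h * s2 * s2) % n == m % n


n = 53 * 61          # toy modulus, illustrative only
k = 17               # private key (assumed value)
h = oss_keygen(n, k)
s1, s2 = oss_sign(1234, k, n, r=7)
print(oss_verify(1234, s1, s2, h, n))   # genuine message
print(oss_verify(1235, s1, s2, h, n))   # altered message
```

Verification works because the cross terms cancel: expanding `s1^2 + h * s2^2` with `h = -1/k^2` leaves exactly `m` modulo `n`.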


          Robust and fine-grained prosody control of end-to-end speech synthesis. (arXiv:1811.02122v1 [cs.CL])

Authors: Younggun Lee, Taesu Kim

We propose prosody embeddings for emotional and expressive speech synthesis networks. The proposed methods introduce temporal structures in the embedding networks, which enable fine-grained control of the speaking style of the synthesized speech. The temporal structures can be designed on either the speech side or the text side, which leads to different control resolutions in time. The prosody embedding networks are plugged into end-to-end speech synthesis networks and trained without any supervision other than the target speech for synthesis. The prosody embedding networks learn to extract prosodic features. By adjusting the learned prosody features, we can change the pitch and amplitude of the synthesized speech at both the frame level and the phoneme level. We also introduce temporal normalization of prosody embeddings, which shows better robustness against speaker perturbation in prosody transfer tasks.


          Modeling and Predicting Citation Count via Recurrent Neural Network with Long Short-Term Memory. (arXiv:1811.02129v1 [cs.DL])

Authors: Sha Yuan, Jie Tang, Yu Zhang, Yifan Wang, Tong Xiao

The rapid evolution of scientific research has been creating a huge volume of publications every year. Among the many quantification measures of scientific impact, citation count stands out for its frequent use in the research community. Although the peer review process is the main reliable way of predicting a paper's future impact, the ability to foresee lasting impact on the basis of citation records is increasingly important in scientific impact analysis in the era of big data. This paper focuses on long-term citation count prediction for individual publications, which has become an emerging and challenging applied research topic. We unify the formulations of the four key phenomena confirmed independently in previous studies of long-term scientific impact quantification: the intrinsic quality of publications, the aging effect, the Matthew effect, and the recency effect. Building on these formulations, we propose a long-term citation count prediction model for individual papers via a recurrent neural network with long short-term memory units. Extensive experiments on a large real-world citation data set demonstrate that the proposed model consistently outperforms existing methods and achieves a significant performance improvement.


          Bootstrapping single-channel source separation via unsupervised spatial clustering on stereo mixtures. (arXiv:1811.02130v1 [cs.SD])

Authors: Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo

Separating an audio scene into isolated sources is a fundamental problem in computer audition, analogous to image segmentation in visual scene analysis. Source separation systems based on deep learning are currently the most successful approaches for solving the underdetermined separation problem, where there are more sources than channels. Traditionally, such systems are trained on sound mixtures where the ground truth decomposition is already known. Since most real-world recordings do not have such a decomposition available, this limits the range of mixtures one can train on, and the range of mixtures the learned models may successfully separate. In this work, we use a simple blind spatial source separation algorithm to generate estimated decompositions of stereo mixtures. These estimates, together with a weighting scheme in the time-frequency domain, based on confidence in the separation quality, are used to train a deep learning model that can be used for single-channel separation, where no source direction information is available. This demonstrates how a simple cue such as the direction of origin of source can be used to bootstrap a model for source separation that can be used in situations where that cue is not available.


          Student's t-Generative Adversarial Networks. (arXiv:1811.02132v1 [cs.LG])

Authors: Jinxuan Sun, Guoqiang Zhong, Yang Chen, Yongbin Liu, Tao Li, Zhongwen Guo

Generative Adversarial Networks (GANs) perform well in image generation, but they need a large amount of data to train the entire framework and often produce nonsensical results. We propose a new method, building on conditional GAN, which equips the latent noise with a mixture of Student's t-distributions with an attention mechanism in addition to class information. Student's t-distribution has long tails that can provide more diversity to the latent noise. Meanwhile, the discriminator in our model performs two tasks simultaneously: judging whether the images come from the true data distribution, and identifying the class of each generated image. The parameters of the mixture model can be learned along with those of GANs. Moreover, we mathematically prove that any multivariate Student's t-distribution can be obtained by a linear transformation of a normal multivariate Student's t-distribution. Experiments comparing the proposed method with typical GAN, DeliGAN and DCGAN indicate that our method performs well at generating diverse and legible objects with limited data.


          On the Termination Problem for Probabilistic Higher-Order Recursive Programs. (arXiv:1811.02133v1 [cs.PL])

Authors: Naoki Kobayashi, Ugo Dal Lago, Charles Grellois

In the last two decades, there has been much progress on model checking of both probabilistic systems and higher-order programs. In spite of the emergence of higher-order probabilistic programming languages, not much has been done to combine those two approaches. In this paper, we initiate a study on the probabilistic higher-order model checking problem, by giving some first theoretical and experimental results. As a first step towards our goal, we introduce PHORS, a probabilistic extension of higher-order recursion schemes (HORS), as a model of probabilistic higher-order programs. The model of PHORS may alternatively be viewed as a higher-order extension of recursive Markov chains. We then investigate the probabilistic termination problem (or, equivalently, the probabilistic reachability problem). We prove that almost sure termination of order-2 PHORS is undecidable. We also provide a fixpoint characterization of the termination probability of PHORS, and develop a sound (but possibly incomplete) procedure for approximately computing the termination probability. We have implemented the procedure for order-2 PHORSs, and confirmed that the procedure works well through preliminary experiments that are reported at the end of the article.
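
A small first-order example may help show what a fixpoint characterization of termination probability looks like. Consider a hypothetical recursive program that halts with probability `p` and otherwise calls itself twice in sequence; its termination probability is the least solution of x = p + (1 - p) x^2. The sketch below approximates that least fixpoint by iterating from 0. It is not the procedure from the paper, and the function name and iteration count are assumptions.

```python
def termination_probability(p, iterations=200):
    """Approximate the least fixpoint of x = p + (1 - p) * x^2 by iterating from 0."""
    x = 0.0
    for _ in range(iterations):
        x = p + (1.0 - p) * x * x
    return x


# For p = 0.25 the equation has roots 1/3 and 1; iteration from 0 reaches the smaller one.
print(termination_probability(0.25))
# For p = 0.75 the least root in [0, 1] is 1, so the program terminates almost surely.
print(termination_probability(0.75))
```

Starting from 0 matters: the iterates increase monotonically and so converge to the least fixpoint, whereas x = 1 is always a solution of the equation but need not be the termination probability.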


          Transfer learning of language-independent end-to-end ASR with language model fusion. (arXiv:1811.02134v1 [cs.CL])

Authors: Hirofumi Inaguma, Jaejin Cho, Murali Karthick Baskar, Tatsuya Kawahara, Shinji Watanabe

This work explores better adaptation methods to low-resource languages using an external language model (LM) under the framework of transfer learning. We first build a language-independent ASR system in a unified sequence-to-sequence (S2S) architecture with a shared vocabulary among all languages. During adaptation, we perform LM fusion transfer, where an external LM is integrated into the decoder network of the attention-based S2S model in the whole adaptation stage, to effectively incorporate linguistic context of the target language. We also investigate various seed models for transfer learning. Experimental evaluations using the IARPA BABEL data set show that LM fusion transfer improves performance on all five target languages compared with simple transfer learning when external text data is available. Our final system drastically reduces the performance gap from the hybrid systems.


          Extended Isolation Forest. (arXiv:1811.02141v1 [cs.LG])

Authors: Sahand Hariri, Matias Carrasco Kind, Robert J. Brunner

We present an extension to the model-free anomaly detection algorithm, Isolation Forest. This extension, named Extended Isolation Forest (EIF), improves the consistency and reliability of the anomaly score produced for a given data point. Using score maps, we show that the standard Isolation Forest produces inconsistent scores. The score maps suffer from an artifact generated as a result of how the criterion for the branching operation of the binary tree is selected. We propose two different approaches for improving the situation. The first is to transform the data randomly before the creation of each tree, which averages out the bias introduced in the algorithm. The second, which is the preferred way, is to allow the slicing of the data to use hyperplanes with random slopes. This approach results in improved score maps. We show that the consistency and reliability of the algorithm are much improved using this method by looking at the variance of scores of data points distributed along constant score lines. We find no appreciable difference in the rate of convergence or in computation time between the standard Isolation Forest and EIF, which highlights its potential as an anomaly detection algorithm.
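
The difference between the two branching rules can be sketched as a single split step. The first function cuts along one randomly chosen axis, as in the standard Isolation Forest; the second cuts with a hyperplane whose normal vector is drawn at random, which is one way to realize "hyperplanes with random slopes". Function names, the Gaussian choice for the normal, and the sample points are assumptions, and no tree building or scoring is shown.

```python
import random


def axis_parallel_split(points, rng):
    """One standard split: pick a random axis and a random cut value along it."""
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return left, right


def random_slope_split(points, rng):
    """One extended split: a hyperplane with a random normal and random intercept."""
    dims = range(len(points[0]))
    normal = [rng.gauss(0.0, 1.0) for _ in dims]
    intercept = [rng.uniform(min(p[d] for p in points), max(p[d] for p in points))
                 for d in dims]
    left, right = [], []
    for p in points:
        side = sum((p[d] - intercept[d]) * normal[d] for d in dims)
        (left if side <= 0 else right).append(p)
    return left, right


points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (-1.0, 0.5)]
print(axis_parallel_split(points, random.Random(1)))
print(random_slope_split(points, random.Random(1)))
```

Because axis-parallel cuts only ever run along coordinate directions, they tend to leave rectangular artifacts in score maps; arbitrary slopes remove that preferred direction.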


          Erasure coding for distributed matrix multiplication for matrices with bounded entries. (arXiv:1811.02144v1 [cs.DC])

Authors: Li Tang, Kostas Konstantinidis, Aditya Ramamoorthy

Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has demonstrated that straggler mitigation can be viewed as a problem of designing erasure codes. For matrices $\mathbf A$ and $\mathbf B$, the technique essentially maps the computation of $\mathbf A^T \mathbf B$ into the multiplication of smaller (coded) submatrices. The stragglers are treated as erasures in this process. The computation can be completed as long as a certain number of workers (called the recovery threshold) complete their assigned tasks.

We present a novel coding strategy for this problem when the absolute values of the matrix entries are sufficiently small. We demonstrate a tradeoff between the assumed absolute value bounds on the matrix entries and the recovery threshold. At one extreme, we are optimal with respect to the recovery threshold and on the other extreme, we match the threshold of prior work. Experimental results on cloud-based clusters validate the benefits of our method.


          TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents. (arXiv:1811.02146v1 [cs.CV])

Authors: Yuexin Ma, Xinge Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, Dinesh Manocha

To safely and efficiently navigate in complex urban traffic, autonomous vehicles must make responsible predictions in relation to surrounding traffic-agents (vehicles, bicycles, pedestrians, etc.). A challenging and critical task is to explore the movement patterns of different traffic-agents and predict their future trajectories accurately to help the autonomous vehicle make reasonable navigation decisions. To solve this problem, we propose a long short-term memory-based (LSTM-based) real-time traffic prediction algorithm, TrafficPredict. Our approach uses an instance layer to learn instances' movements and interactions and a category layer to learn the similarities of instances belonging to the same type in order to refine the prediction. To evaluate its performance, we collected trajectory datasets in a large city under varying conditions and traffic densities. The dataset includes many challenging scenarios where vehicles, bicycles, and pedestrians move among one another. We evaluate the performance of TrafficPredict on our new dataset and highlight its higher accuracy for trajectory prediction by comparing it with prior prediction methods.


          A New Analysis for Support Recovery with Block Orthogonal Matching Pursuit. (arXiv:1811.02152v1 [cs.IT])

Authors: Haifeng Li, Jinming Wen

Compressed Sensing (CS) is a signal processing technique that can accurately recover sparse signals from linear measurements with far fewer measurements than those required by the classical Shannon-Nyquist theorem. Block sparse signals, i.e., sparse signals whose nonzero coefficients occur in a few blocks, arise in many fields. Block orthogonal matching pursuit (BOMP) is a popular greedy algorithm for recovering block sparse signals due to its high efficiency and effectiveness. By fully using the block sparsity of block sparse signals, BOMP can achieve very good recovery performance. This paper proposes a sufficient condition to ensure that BOMP can exactly recover the support of block $K$-sparse signals in the noisy case. This condition is better than existing ones.


          FloWaveNet : A Generative Flow for Raw Audio. (arXiv:1811.02155v1 [cs.SD])

Authors: Sungwon Kim, Sang-gil Lee, Jongyoon Song, Sungroh Yoon

Most modern text-to-speech architectures use a WaveNet vocoder for synthesizing high-fidelity waveform audio, but its slow autoregressive sampling scheme has limited practical applications. The recently suggested Parallel WaveNet has achieved real-time audio synthesis by incorporating Inverse Autoregressive Flow (IAF) for parallel sampling. However, Parallel WaveNet requires a two-stage training pipeline with a well-trained teacher network and is prone to mode collapse if trained with probability distillation only. We propose FloWaveNet, a flow-based generative model for raw audio synthesis. FloWaveNet requires only a single maximum likelihood loss without any additional auxiliary terms and is inherently parallel due to the flow-based transformation. The model can efficiently sample raw audio in real time with a clarity comparable to the original WaveNet and ClariNet. Code and samples for all models, including our FloWaveNet, are available via GitHub: https://github.com/ksw0306/FloWaveNet


          How Many Pairwise Preferences Do We Need to Rank A Graph Consistently? (arXiv:1811.02161v1 [cs.LG])

Authors: Aadirupa Saha, Rakesh Shivanna, Chiranjib Bhattacharyya

We consider the problem of optimal recovery of true ranking of $n$ items from a randomly chosen subset of their pairwise preferences. It is well known that without any further assumption, one requires a sample size of $\Omega(n^2)$ for the purpose. We analyze the problem with an additional structure of relational graph $G([n],E)$ over the $n$ items added with an assumption of \emph{locality}: Neighboring items are similar in their rankings. Noting the preferential nature of the data, we choose to embed not the graph, but, its \emph{strong product} to capture the pairwise node relationships. Furthermore, unlike existing literature that uses Laplacian embedding for graph based learning problems, we use a richer class of graph embeddings---\emph{orthonormal representations}---that includes (normalized) Laplacian as its special case. Our proposed algorithm, {\it Pref-Rank}, predicts the underlying ranking using an SVM based approach over the chosen embedding of the product graph, and is the first to provide \emph{statistical consistency} on two ranking losses: \emph{Kendall's tau} and \emph{Spearman's footrule}, with a required sample complexity of $O(n^2 \chi(\bar{G}))^{\frac{2}{3}}$ pairs, $\chi(\bar{G})$ being the \emph{chromatic number} of the complement graph $\bar{G}$. Clearly, our sample complexity is smaller for dense graphs, with $\chi(\bar G)$ characterizing the degree of node connectivity, which is also intuitive due to the locality assumption e.g. $O(n^\frac{4}{3})$ for union of $k$-cliques, or $O(n^\frac{5}{3})$ for random and power law graphs etc.---a quantity much smaller than the fundamental limit of $\Omega(n^2)$ for large $n$. This, for the first time, relates ranking complexity to structural properties of the graph. We also report experimental evaluations on different synthetic and real datasets, where our algorithm is shown to outperform the state-of-the-art methods.
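
The two ranking losses named in this abstract can be computed as follows. This is a minimal sketch using the common unnormalized definitions (count of discordant pairs, and sum of absolute rank differences); the paper may use normalized variants, and the function names and example rankings are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(list(rank_a), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )

def spearman_footrule(rank_a, rank_b):
    """Sum over items of the absolute difference between their two ranks."""
    return sum(abs(rank_a[i] - rank_b[i]) for i in rank_a)

# Rankings are given as item -> position (1 is best); values are made up.
true_rank = {"a": 1, "b": 2, "c": 3, "d": 4}
predicted = {"a": 2, "b": 1, "c": 3, "d": 4}

print(kendall_tau_distance(true_rank, predicted))  # one swapped pair
print(spearman_footrule(true_rank, predicted))
```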


          Language model integration based on memory control for sequence to sequence speech recognition. (arXiv:1811.02162v1 [eess.AS])

Authors: Jaejin Cho, Shinji Watanabe, Takaaki Hori, Murali Karthick Baskar, Hirofumi Inaguma, Jesus Villalba, Najim Dehak

In this paper, we explore several new schemes to train a seq2seq model to integrate a pre-trained LM. Our proposed fusion methods focus on the memory cell state and the hidden state in the seq2seq decoder long short-term memory (LSTM), and, unlike in prior studies, the memory cell state is updated by the LM. This means the memory retained by the main seq2seq model is adjusted by the external LM. These fusion methods have several variants depending on the architecture of this memory cell update and on the use of the memory cell and hidden states, which directly affects the final label inference. We performed experiments to show the effectiveness of the proposed methods in a monolingual ASR setup on the Librispeech corpus and in a transfer learning setup from a multilingual ASR (MLASR) base model to a low-resourced language. On Librispeech, our best model with multi-level decoding improved WER by 3.7% and 2.4% on test clean and test other, respectively, relative to the shallow fusion baseline. In transfer learning from an MLASR base model to the IARPA Babel Swahili model, the best scheme improved the transferred model on the eval set by 9.9% in CER and 9.8% in WER relative to the 2-stage transfer baseline.


          "I had a solid theory before but it's falling apart": Polarizing Effects of Algorithmic Transparency. (arXiv:1811.02163v1 [cs.HC])

Authors: Aaron Springer, Steve Whittaker

The rise of machine learning has brought closer scrutiny to intelligent systems, leading to calls for greater transparency and explainable algorithms. We explore the effects of transparency on user perceptions of a working intelligent system for emotion detection. In exploratory Study 1, we observed paradoxical effects of transparency, which improved perceptions of system accuracy for some participants while reducing them for others. In Study 2, we test this observation using mixed methods, showing that the apparent transparency paradox can be explained by a mismatch between participant expectations and system predictions. We qualitatively examine this process, finding that transparency can undermine user confidence by causing users to fixate on flaws when they already have a model of system operation. In contrast, transparency helps if users lack such a model. Finally, we revisit the notion of transparency and suggest design considerations for building safe and successful machine learning systems based on our insights.


          Progressive Disclosure: Designing for Effective Transparency. (arXiv:1811.02164v1 [cs.HC])

Authors: Aaron Springer, Steve Whittaker

As we increasingly delegate important decisions to intelligent systems, it is essential that users understand how algorithmic decisions are made. Prior work has often taken a technocentric approach to transparency. In contrast, we explore empirical user-centric methods to better understand user reactions to transparent systems. We assess user reactions to global and incremental feedback in two studies. In Study 1, users anticipated that the more transparent incremental system would perform better, but retracted this evaluation after experience with the system. Qualitative data suggest this may arise because incremental feedback is distracting and undermines simple heuristics users form about system operation. Study 2 explored these effects in depth, suggesting that users may benefit from initially simplified feedback that hides potential system errors and assists users in building working heuristics about system operation. We use these findings to motivate new progressive disclosure principles for transparency in intelligent systems.


          A Novel Compressed Sensing Technique for Traffic Matrix Estimation of Software Defined Cloud Networks. (arXiv:1811.02165v1 [cs.NI])

Authors: Sameer Qazi, Syed Muhammad Atif, Muhammad Bilal Kadri

Traffic matrix estimation has long attracted researchers' attention as a tool for better network management and future planning. With the advent of high traffic loads due to Cloud Computing platforms and of Software Defined Networking based tunable routing and traffic management algorithms on the Internet, it is more necessary than ever to be able to predict current and future traffic volumes on the network. For large networks, such an origin-destination traffic prediction problem takes the form of a large under-constrained and under-determined system of equations with a dynamic measurement matrix. In this work, we present our Compressed Sensing with Dynamic Model Estimation (CS-DME) architecture suitable for modern software defined networks. Our main contributions are: (1) We formulate an approach in which the measurement matrix in the compressed sensing scheme can be accurately and dynamically estimated through a reformulation of the problem based on traffic demands. (2) By inspecting its eigenspectrum on two real-world datasets, we show that a problem formulation using a dynamic measurement matrix based on instantaneous traffic demands may be used instead of a stationary binary routing matrix, and is more suitable to modern Software Defined Networks that are constantly evolving in terms of routing. (3) We also show that linking this compressed measurement matrix dynamically with the measured parameters can lead to acceptable estimation of Origin Destination (OD) traffic flows, with only marginally poorer results than other state-of-the-art schemes relying on fixed measurement matrices. (4) Furthermore, using this compressed reformulated problem, we present a new strategy for selecting vantage points for the most efficient traffic matrix estimation, through a secondary compression technique based on a subset of link measurements.


          DIAG-NRE: A Deep Pattern Diagnosis Framework for Distant Supervision Neural Relation Extraction. (arXiv:1811.02166v1 [cs.CL])

Authors: Shun Zheng, Peilin Yu, Lu Chen, Ling Huang, Wei Xu

Modern neural network models have achieved state-of-the-art performance on relation extraction (RE) tasks. Although distant supervision (DS) can automatically generate training labels for RE, the effectiveness of DS depends heavily on datasets and relation types, and sometimes it may introduce substantial labeling noise. In this paper, we propose a deep pattern diagnosis framework, DIAG-NRE, that aims to diagnose and improve neural relation extraction (NRE) models trained on DS-generated data. DIAG-NRE includes three stages: (1) The deep pattern extraction stage employs reinforcement learning to extract regular-expression-style patterns from NRE models. (2) The pattern refinement stage builds a pattern hierarchy to find the most representative patterns and lets human reviewers evaluate them quantitatively by annotating a certain number of pattern-matched examples. In this way, we minimize both the number of labels to annotate and the difficulty of writing heuristic patterns. (3) The weak label fusion stage fuses multiple weak label sources, including DS and refined patterns, to produce noise-reduced labels that can train a better NRE model. To demonstrate the broad applicability of DIAG-NRE, we use it to diagnose 14 relation types of two public datasets with one simple hyper-parameter configuration. We observe different noise behaviors and obtain significant F1 improvements on all relation types suffering from substantial labeling noise.


          Knuth's Moves on Timed Words. (arXiv:1811.02169v1 [math.CO])

Authors: Amritanshu Prasad

We give an exposition of Schensted's algorithm to find the length of the longest increasing subword of a word in an ordered alphabet, and Greene's generalization of Schensted's results using Knuth equivalence. We announce a generalization of these results to timed words.
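
For readers unfamiliar with Schensted's algorithm, the sketch below keeps only the first row of the insertion tableau, whose final length equals the length of the longest weakly increasing subword. The function name is hypothetical, and the timed-word generalization announced in the abstract is not covered; for strictly increasing subwords one would use `bisect_left` instead of `bisect_right`.

```python
from bisect import bisect_right

def longest_increasing_subword_length(word):
    """Length of the longest weakly increasing subword via first-row insertion."""
    row = []  # first row of the insertion tableau, always weakly increasing
    for letter in word:
        # Position of the leftmost entry strictly greater than the new letter.
        pos = bisect_right(row, letter)
        if pos == len(row):
            row.append(letter)   # nothing larger: the row grows by one
        else:
            row[pos] = letter    # bump the larger entry out of the first row
    return len(row)

print(longest_increasing_subword_length([3, 1, 4, 1, 5, 9, 2, 6]))
print(longest_increasing_subword_length("cba"))
```

Each letter costs one binary search, so the whole word is processed in O(n log n) time.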


          Comments Regarding `On the Identifiability of the Influence Model for Stochastic Spatiotemporal Spread Processes'. (arXiv:1811.02171v1 [cs.SY])

Authors: Sandip Roy

The identifiability analysis of a networked Markov chain model known as the influence model, as described in a recent contribution to arXiv, is examined. Two errors in the identifiability analysis -- one related to the unidentifiability of the partially-observed influence model, the second related to an omission of an additional recurrence criterion for identifiability -- are noted. In addition, some concerns about the formulation of the identifiability problem and the proposed estimation approach are noted.


          Neural Phrase-to-Phrase Machine Translation. (arXiv:1811.02172v1 [cs.CL])

Authors: Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

In this paper, we propose Neural Phrase-to-Phrase Machine Translation (NP$^2$MT). Our model uses a phrase attention mechanism to discover relevant input (source) segments that are used by a decoder to generate output (target) phrases. We also design an efficient dynamic programming algorithm to decode segments, which allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018). Furthermore, our method can naturally integrate with external phrase dictionaries during decoding. Empirical experiments show that our method achieves performance comparable to state-of-the-art methods on benchmark datasets. However, when the training and testing data are from different distributions or domains, our method performs better.


          The entropy of lies: playing twenty questions with a liar. (arXiv:1811.02177v1 [cs.DS])

Authors: Yuval Dagan, Yuval Filmus, Daniel Kane, Shay Moran

`Twenty questions' is a guessing game played by two players: Bob thinks of an integer between $1$ and $n$, and Alice's goal is to recover it using a minimal number of Yes/No questions. Shannon's entropy has a natural interpretation in this context. It characterizes the average number of questions used by an optimal strategy in the distributional variant of the game: let $\mu$ be a distribution over $[n]$, then the average number of questions used by an optimal strategy that recovers $x\sim \mu$ is between $H(\mu)$ and $H(\mu)+1$. We consider an extension of this game where at most $k$ questions can be answered falsely. We extend the classical result by showing that an optimal strategy uses roughly $H(\mu) + k H_2(\mu)$ questions, where $H_2(\mu) = \sum_x \mu(x)\log\log\frac{1}{\mu(x)}$. This also generalizes a result by Rivest et al. for the uniform distribution. Moreover, we design near optimal strategies that only use comparison queries of the form `$x \leq c$?' for $c\in[n]$. The usage of comparison queries lends itself naturally to the context of sorting, where we derive sorting algorithms in the presence of adversarial noise.
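
The two quantities in this abstract are easy to evaluate numerically. The sketch below assumes base-2 logarithms and a distribution with $0 < \mu(x) < 1$ for every listed outcome (so that $\log\log\frac{1}{\mu(x)}$ is defined); the function names are hypothetical, and the "rough" question count ignores whatever lower-order terms the paper's statement contains.

```python
import math

def shannon_entropy(mu):
    """H(mu) = sum_x mu(x) * log2(1 / mu(x))."""
    return sum(p * math.log2(1.0 / p) for p in mu if p > 0.0)

def h2(mu):
    """H_2(mu) = sum_x mu(x) * log2(log2(1 / mu(x))); assumes 0 < mu(x) < 1."""
    return sum(p * math.log2(math.log2(1.0 / p)) for p in mu if p > 0.0)

def rough_question_count(mu, k):
    """Leading-order estimate H(mu) + k * H_2(mu) for at most k false answers."""
    return shannon_entropy(mu) + k * h2(mu)

uniform_16 = [1.0 / 16] * 16
print(shannon_entropy(uniform_16))          # log2(16)
print(h2(uniform_16))                       # log2(log2(16))
print(rough_question_count(uniform_16, 1))
```

For the uniform distribution on $n$ items the estimate reduces to $\log n + k \log\log n$.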


          Fast OBDD Reordering using Neural Message Passing on Hypergraph. (arXiv:1811.02178v1 [cs.AI])

Authors: Feifan Xu, Fei He, Enze Xie, Liang Li

Ordered binary decision diagrams (OBDDs) are an efficient data structure for representing and manipulating Boolean formulas. Depending on the variable order, the size of an OBDD may vary from linear to exponential in the number of Boolean variables. Finding the optimal variable order has been proved to be an NP-complete problem. Many heuristics have been proposed to find a near-optimal solution to this problem. In this paper, we propose a neural network-based method to predict near-optimal variable orders for unknown formulas. Viewing these formulas as hypergraphs, and lifting the message passing neural network to 3-hypergraphs (MPNN3), we are able to learn the patterns of Boolean formulas. Compared to traditional methods, our method can find a near-best solution in a much shorter time, even for some hard examples. To the best of our knowledge, this is the first work on applying neural networks to OBDD reordering.


          Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition. (arXiv:1811.02182v1 [cs.CL])

Authors: Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Sang-Hoon Oh, Soo-Young Lee

Many speech enhancement methods try to learn the relationship between noisy and clean speech, obtained using an acoustic room simulator. We point out several limitations of enhancement methods relying on clean speech targets; the goal of this work is to propose an alternative learning algorithm, called acoustic and adversarial supervision (AAS). AAS makes the enhanced output both maximize the likelihood of the transcription under a pre-trained acoustic model and exhibit the general characteristics of clean speech, which improves generalization to unseen noisy speech. We employ connectionist temporal classification and the unpaired conditional boundary equilibrium generative adversarial network as the loss functions of AAS. AAS is tested on two datasets containing additive noise without and with reverberation, Librispeech + DEMAND and CHiME-4. By visualizing the enhanced speech with different loss combinations, we demonstrate the role of each supervision. AAS achieves a lower word error rate than other state-of-the-art methods using the clean speech target on both datasets.


          A Dynamic Regret Analysis and Adaptive Regularization Algorithm for On-Policy Robot Imitation Learning. (arXiv:1811.02184v1 [cs.RO])

Authors: Jonathan Lee, Michael Laskey, Ajay Kumar Tanwani, Anil Aswani, Ken Goldberg

On-policy imitation learning algorithms such as Dagger evolve a robot control policy by executing it, measuring performance (loss), obtaining corrective feedback from a supervisor, and generating the next policy. As the loss between iterations can vary unpredictably, a fundamental question is under what conditions this process will eventually achieve a converged policy. If one assumes the underlying trajectory distribution is static (stationary), it is possible to prove convergence for Dagger. Cheng and Boots (2018) consider the more realistic model for robotics where the underlying trajectory distribution, which is a function of the policy, is dynamic and show that it is possible to prove convergence when a condition on the rate of change of the trajectory distributions is satisfied. In this paper, we reframe that result using dynamic regret theory from the field of Online Optimization to prove convergence to locally optimal policies for Dagger, Imitation Gradient, and Multiple Imitation Gradient. These results inspire a new algorithm, Adaptive On-Policy Regularization (AOR), that ensures the conditions for convergence. We present simulation results with cart-pole balancing and walker locomotion benchmarks that suggest AOR can significantly decrease dynamic regret and chattering. To our knowledge, this is the first application of dynamic regret theory to imitation learning.


          Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators. (arXiv:1811.02187v1 [cs.NE])

Authors: Yulhwa Kim, Hyungjun Kim, Jae-Joon Kim

Recently, RRAM-based Binary Neural Network (BNN) hardware has been gaining interest, as it requires only 1-bit sense amplifiers and eliminates the need for high-resolution ADCs and DACs. However, RRAM-based BNN hardware still requires a high-resolution ADC for partial sum calculation when a large-scale neural network is implemented using multiple memory arrays. We propose a neural network-hardware co-design approach that splits the input so that each split network fits on an RRAM array and the reconstructed BNNs calculate a 1-bit output neuron in each array. As a result, the ADC can be completely eliminated from the design even for large-scale neural networks. Simulation results show that the proposed network reconstruction and retraining recover the inference accuracy of the original BNN. The accuracy loss of the proposed scheme in the CIFAR-10 test case was less than 1.1% compared to the original network.


          Adaptive Stress Testing: Finding Failure Events with Reinforcement Learning. (arXiv:1811.02188v1 [cs.AI])

Authors: Ritchie Lee, Ole J. Mengshoel, Anshu Saksena, Ryan Gardner, Daniel Genin, Joshua Silbermann, Michael Owen, Mykel J. Kochenderfer

Finding the most likely path to a set of failure states is important to the analysis of safety-critical dynamic systems. While efficient solutions exist for certain classes of systems, a scalable general solution for stochastic, partially-observable, and continuous-valued systems remains challenging. Existing approaches in formal and simulation-based methods either cannot scale to large systems or are computationally inefficient. This paper presents adaptive stress testing (AST), a framework for searching a simulator for the most likely path to a failure event. We formulate the problem as a Markov decision process and use reinforcement learning to optimize it. The approach is simulation-based and does not require internal knowledge of the system. As a result, the approach is very suitable for black box testing of large systems. We present formulations for both systems where the state is fully-observable and partially-observable. In the latter case, we present a modified Monte Carlo tree search algorithm that only requires access to the pseudorandom number generator of the simulator to overcome partial observability. We also present an extension of the framework, called differential adaptive stress testing (DAST), that can be used to find failures that occur in one system but not in another. This type of differential analysis is useful in applications such as regression testing, where one is concerned with finding areas of relative weakness compared to a baseline. We demonstrate the effectiveness of the approach on an aircraft collision avoidance application, where we stress test a prototype aircraft collision avoidance system to find high-probability scenarios of near mid-air collisions.


          BLP - Boundary Likelihood Pinpointing Networks for Accurate Temporal Action Localization. (arXiv:1811.02189v1 [cs.CV])

Authors: Weijie Kong, Nannan Li, Shan Liu, Thomas Li, Ge Li

Despite tremendous progress in temporal action detection, state-of-the-art methods still suffer from sharp performance deterioration when localizing the starting and ending temporal action boundaries. Although most methods apply a boundary regression paradigm to tackle this problem, we argue that direct regression lacks sufficiently detailed information to yield accurate temporal boundaries. In this paper, we propose a novel Boundary Likelihood Pinpointing (BLP) network to alleviate this deficiency of boundary regression and improve the localization accuracy. Given a loosely localized search interval that contains an action instance, BLP casts the problem of localizing temporal boundaries as that of assigning probabilities to each equally divided unit of this interval. These generated probabilities provide useful information regarding the boundary location of the action inside this search interval. Based on these probabilities, we introduce a boundary pinpointing paradigm to pinpoint the accurate boundaries under a simple probabilistic framework. Compared with other C3D feature based detectors, extensive experiments demonstrate that BLP significantly improves the localization performance of recent state-of-the-art detectors and achieves competitive detection mAP on both the THUMOS'14 and ActivityNet datasets, particularly when the evaluation tIoU is high.


          3DCapsule: Extending the Capsule Architecture to Classify 3D Point Clouds. (arXiv:1811.02191v1 [cs.CV])

Authors: Ali Cheraghian, Lars Petersson

This paper introduces the 3DCapsule, a 3D extension of the recently introduced Capsule concept that makes it applicable to unordered point sets. The original Capsule relies on the existence of a spatial relationship between the elements in the feature map it is presented with, whereas in point permutation invariant formulations of 3D point set classification methods, such relationships are typically lost. Here, a new layer called ComposeCaps is introduced that, in lieu of a spatially relevant feature mapping, learns a new mapping that can be exploited by the 3DCapsule. Previous works in the 3D point set classification domain have focused on other parts of the architecture, whereas the 3DCapsule is a drop-in replacement for the commonly used fully connected classifier. An ablation study demonstrates that when the 3DCapsule is applied to recent 3D point set classification architectures, it consistently shows an improvement, in particular when subjected to noisy data. Similarly, the ComposeCaps layer is evaluated and demonstrates an improvement over the baseline. In an apples-to-apples comparison against state-of-the-art methods, the 3DCapsule again demonstrates better performance.


          In-the-wild Facial Expression Recognition in Extreme Poses. (arXiv:1811.02194v1 [cs.CV])

Authors: Fei Yang, Qian Zhang, Chi Zheng, Guoping Qiu

In the computer research area, facial expression recognition is a hot research problem. In recent years, the research has moved from the lab environment to in-the-wild circumstances, which are challenging, especially under extreme poses. Current expression detection systems try to avoid pose effects in order to gain general applicability. In this work, we take the opposite approach: we take head poses into account and detect expressions within specific head poses. Our work includes two parts: detecting the head pose and grouping it into one of the pre-defined head pose classes, and performing facial expression recognition within each pose class. Our experiments show that the recognition results with pose class grouping are much better than those of direct recognition without considering poses. We combine hand-crafted features (SIFT, LBP and geometric features) with deep learning features as the representation of the expressions. The hand-crafted features are added into the deep learning framework along with the high-level deep learning features. As a comparison, we implement SVM and random forest as the prediction models. To train and test our methodology, we labeled the face dataset with 6 basic expressions.


          Credit Card Fraud Detection in e-Commerce: An Outlier Detection Approach. (arXiv:1811.02196v1 [cs.LG])

Authors: Utkarsh Porwal, Smruthi Mukund

Often the challenge associated with tasks like fraud and spam detection is the lack of all the likely patterns needed to train suitable supervised learning models. This problem is accentuated when the fraudulent patterns are not only scarce but also change over time. Fraudulent patterns change because fraudsters continue to devise novel ways to circumvent measures put in place to prevent fraud. Limited data and continuously changing patterns make learning significantly more difficult. We hypothesize that good behavior does not change with time and that data points representing good behavior have a consistent spatial signature under different groupings. Based on this hypothesis, we propose an approach that detects outliers in large data sets by assigning a consistency score to each data point using an ensemble of clustering methods. Our main contribution is a novel method that can detect outliers in large datasets and is robust to changing patterns. We also argue that the area under the ROC curve, although a commonly used metric for evaluating outlier detection methods, is not the right metric. Since outlier detection problems have a skewed distribution of classes, precision-recall curves are better suited because precision compares false positives to true positives (outliers) rather than true negatives (inliers) and therefore is not affected by the problem of class imbalance. We show empirically that the area under the precision-recall curve is a better evaluation metric than the area under the ROC curve. The proposed approach is tested on the modified version of the Landsat satellite dataset, the modified version of the ann-thyroid dataset and a large real-world credit card fraud detection dataset available through Kaggle, where we show significant improvement over the baseline methods.
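
The argument about ROC versus precision-recall under class imbalance can be checked with simple arithmetic. In the sketch below, the detector's true positive rate and false positive rate are held fixed while the number of inliers grows; the counts and rates are made-up illustrative values, not results from the paper, and `operating_point` is a hypothetical helper.

```python
def operating_point(n_outliers, n_inliers, tpr, fpr):
    """Return the ROC coordinates and the precision implied by fixed TPR and FPR."""
    true_positives = tpr * n_outliers    # outliers correctly flagged
    false_positives = fpr * n_inliers    # inliers wrongly flagged
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged > 0 else 0.0
    return {"tpr": tpr, "fpr": fpr, "precision": precision}

mild = operating_point(n_outliers=100, n_inliers=1_000, tpr=0.9, fpr=0.01)
skewed = operating_point(n_outliers=100, n_inliers=1_000_000, tpr=0.9, fpr=0.01)

# The ROC point (fpr, tpr) is identical in both cases ...
print(mild["fpr"], mild["tpr"], skewed["fpr"], skewed["tpr"])
# ... but precision reveals how many flagged points are false alarms.
print(round(mild["precision"], 4), round(skewed["precision"], 4))
```

Because the false positive rate divides by the (very large) number of inliers, the ROC point does not move, whereas precision compares false positives directly with true positives.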


          Collaborative Filtering with Stability. (arXiv:1811.02198v1 [cs.LG])

Authors: Dongsheng Li, Chao Chen, Qin Lv, Junchi Yan, Li Shang, Stephen M. Chu

Collaborative filtering (CF) is a popular technique in today's recommender systems, and matrix approximation-based CF methods have achieved great success in both rating prediction and top-N recommendation tasks. However, real-world user-item rating matrices are typically sparse, incomplete and noisy, which challenges the algorithmic stability of matrix approximation, i.e., small changes in the training data may significantly change the models. As a result, existing matrix approximation solutions yield low generalization performance, exhibiting high error variance on the training data, and minimizing the training error may not guarantee error reduction on the test data. This paper investigates the algorithmic stability problem of matrix approximation methods and how to achieve stable collaborative filtering via stable matrix approximation. We present a new algorithm design framework, which (1) introduces new optimization objectives to guide stable matrix approximation algorithm design, and (2) solves the optimization problem to obtain stable approximation solutions with good generalization performance. Experimental results on real-world datasets demonstrate that the proposed method can achieve better accuracy than state-of-the-art matrix approximation methods and ensemble methods in both rating prediction and top-N recommendation tasks.


          On-the-fly Large-scale Channel-Gain Estimation for Massive Antenna-Array Base Stations. (arXiv:1811.02202v1 [cs.IT])

Authors: Chenwei Wang, Ozgun Y. Bursalioglu, Haralabos Papadopoulos, Giuseppe Caire

We propose a novel scheme for estimating the large-scale gains of the channels between user terminals (UTs) and base stations (BSs) in a cellular system. The scheme leverages TDD operation, uplink (UL) training by means of properly designed non-orthogonal pilot codes, and massive antenna arrays at the BSs. Subject to Q resource elements allocated for UL training and using the new scheme, a BS is able to estimate the large-scale channel gains of K users transmitting UL pilots in its cell and in nearby cells, provided K<=Q^2. Such knowledge of the large-scale channel gains of nearby out-of-cell users can be exploited at the BS to mitigate interference to the out-of-cell users that experience the highest levels of interference from the BS. We investigate the large-scale gain estimation performance provided by a variety of non-orthogonal pilot codebook designs. Our simulations suggest that, among all the code designs considered, Grassmannian line-packing type codes yield the best large-scale channel gain estimation performance.


          On-the-fly Uplink Training and Pilot Code Sequence Design for Cellular Networks. (arXiv:1811.02203v1 [cs.IT])

Authors: Zekun Zhang, Chenwei Wang, Haralabos Papadopoulos

Cellular networks of massive MIMO base-stations employing TDD/OFDM and relying on uplink training for both downlink and uplink transmission are viewed as an attractive candidate for 5G deployments, as they promise high area spectral and energy efficiencies with relatively simple low-latency operation. We investigate the use of non-orthogonal uplink pilot designs as a means for improving the area spectral efficiency in the downlink of such massive MIMO cellular networks. We develop a class of pilot designs that are locally orthogonal within each cell, while maintaining low inner-product properties between codes in different cells. Using channel estimates provided by observations on these codes, each cell independently serves its locally active users with MU-MIMO transmission that is also designed to mitigate interference to a subset of `strongly interfered' out-of-cell users. As our simulation-based analysis shows, such cellular operation based on the proposed codes yields user-rate CDF improvement with respect to conventional operation, which can be exploited to improve cell and/or cell-throughput performance.


          DSNet: Deep and Shallow Feature Learning for Efficient Visual Tracking. (arXiv:1811.02208v1 [cs.CV])

Authors: Qiangqiang Wu, Yan Yan, Yanjie Liang, Yi Liu, Hanzi Wang

In recent years, Discriminative Correlation Filter (DCF) based tracking methods have achieved great success in visual tracking. However, the multi-resolution convolutional feature maps trained on other tasks, such as image classification, cannot be naturally used in the conventional DCF formulation. Furthermore, these high-dimensional feature maps significantly increase the tracking complexity and thus limit the tracking speed. In this paper, we present a deep and shallow feature learning network, namely DSNet, to learn multi-level same-resolution compressed (MSC) features for efficient online tracking, in an end-to-end offline manner. Specifically, the proposed DSNet compresses multi-level convolutional features into features of uniform spatial resolution. The learned MSC features effectively encode both appearance and semantic information of objects in the same-resolution feature maps, thus enabling an elegant combination of the MSC features with any DCF-based method. Additionally, a channel reliability measurement (CRM) method is presented to further refine the learned MSC features. We demonstrate the effectiveness of the MSC features learned by the proposed DSNet on two DCF tracking frameworks: the basic DCF framework and the continuous convolution operator framework. Extensive experiments show that the learned MSC features allow the equipped DCF-based tracking methods to perform favorably against state-of-the-art methods while running at high frame rates.


          Better Late Than Never: A Fully Abstract Semantics for Classical Processes. (arXiv:1811.02209v1 [cs.LO])

Authors: Wen Kokke, Fabrizio Montesi, Marco Peressotti

We present Hypersequent Classical Processes (HCP), a revised interpretation of the "Proofs as Processes" correspondence between linear logic and the {\pi}-calculus initially proposed by Abramsky [1994], and later developed by Bellin and Scott [1994], Caires and Pfenning [2010], and Wadler [2014], among others. HCP mends the discrepancies between linear logic and the syntax and observable semantics of parallel composition in the {\pi}-calculus, by conservatively extending linear logic to hyperenvironments (collections of environments, inspired by the hypersequents by Avron [1991]). Separation of environments in hyperenvironments is internalised by $\otimes$ and corresponds to parallel process behaviour. Thanks to this property, for the first time we are able to extract a labelled transition system (lts) semantics from proof rewritings. Leveraging the information on parallelism at the level of types, we obtain a logical reconstruction of the delayed actions that Merro and Sangiorgi [2004] formulated to model non-blocking I/O in the {\pi}-calculus. We define a denotational semantics for processes based on Brzozowski derivatives, and uncover that non-interference in HCP corresponds to Fubini's theorem of double antiderivation. Having an lts allows us to validate HCP using the standard toolbox of behavioural theory. We instantiate bisimilarity and barbed congruence for HCP, and obtain a full abstraction result: bisimilarity, denotational equivalence, and barbed congruence coincide.


          Hybrid Approach to Automation, RPA and Machine Learning: a Method for the Human-centered Design of Software Robots. (arXiv:1811.02213v1 [cs.SE])

Authors: Wiesław Kopeć, Marcin Skibiński, Cezary Biele, Kinga Skorupska, Dominika Tkaczyk, Anna Jaskulska, Katarzyna Abramczuk, Piotr Gago, Krzysztof Marasek

One of the more prominent trends within Industry 4.0 is the drive to employ Robotic Process Automation (RPA), especially as one of the elements of the Lean approach. The full implementation of RPA is riddled with challenges relating both to the reality of everyday business operations, from SMEs to SSCs and beyond, and the social effects of the changing job market. To successfully address these points there is a need to develop a solution that would adjust to the existing business operations and at the same time lower the negative social impact of the automation process.

To achieve these goals we propose a hybrid, human-centered approach to the development of software robots. This design and implementation method combines the Living Lab approach with empowerment through participatory design to kick-start the co-development and co-maintenance of hybrid software robots. Supported by a variety of AI methods and tools, including interactive and collaborative ML in the cloud, these robots transform menial job posts into higher-skilled positions, allowing former employees to stay on as robot co-designers and maintainers, i.e. as co-programmers who supervise the machine learning processes using tailored high-level RPA Domain Specific Languages (DSLs) to adjust the functioning of the robots and maintain operational flexibility.


          Day-ahead time series forecasting: application to capacity planning. (arXiv:1811.02215v1 [cs.AI])

Authors: Colin Leverger (LACODAM), Vincent Lemaire, Simon Malinowski (UR1, LinkMedia), Thomas Guyet (LACODAM), Laurence Rozé (LACODAM, INSA Rennes)

In the context of capacity planning, forecasting the evolution of IT server usage enables companies to better manage their computational resources. We address this problem by collecting key indicator time series and propose to forecast their evolution a day ahead. Our method assumes that the data is structured by a daily seasonality, but also that there are typical evolutions of the indicators within a day. It then uses a combination of a clustering algorithm and Markov models to produce day-ahead forecasts. Our experiments on real datasets show that the data satisfies our assumption and that, in the case study, our method outperforms classical approaches (AR, Holt-Winters).
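A minimal sketch of how clustering and a Markov model could be combined for day-ahead forecasting. The abstract does not specify the clustering algorithm or the Markov model, so everything here is an assumption: each past day is taken to be already assigned to a cluster of typical daily profiles, a first-order transition table is estimated from the label sequence, and the forecast is the centroid of the most likely next cluster. The names `fit_transitions`, `forecast_next_day` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(labels):
    """Count first-order transitions between consecutive daily cluster labels."""
    counts = defaultdict(Counter)
    for today, tomorrow in zip(labels, labels[1:]):
        counts[today][tomorrow] += 1
    return counts


def forecast_next_day(labels, centroids):
    """Forecast tomorrow's profile as the centroid of the most likely next cluster."""
    counts = fit_transitions(labels)
    last = labels[-1]
    if not counts[last]:
        # Assumption: with no observed transition, repeat today's typical profile.
        return centroids[last]
    next_cluster = counts[last].most_common(1)[0][0]
    return centroids[next_cluster]


# Toy data: two typical daily profiles (shortened to four values per day).
centroids = {
    "busy": [20.0, 80.0, 90.0, 40.0],
    "quiet": [10.0, 15.0, 20.0, 10.0],
}
history = ["busy", "busy", "busy", "busy", "quiet", "quiet", "busy", "busy"]
print(forecast_next_day(history, centroids))
```

A first-order chain over cluster labels is the simplest choice; a real system would also need a rule for ties and for clusters that have never been followed by another day.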


          An Optimal Itinerary Generation in a Configuration Space of Large Intellectual Agent Groups with Linear Logic. (arXiv:1811.02216v1 [cs.AI])

Authors: Dmitry Maximov

A group of intelligent agents that fulfill a set of tasks in parallel is first represented by the tensor multiplication of the corresponding processes in a linear logic game category. An optimal itinerary in the configuration space of the group states is defined as a play with maximal total reward in the category. Two further novelties are that the reward is represented as a degree of certainty (visibility) of an agent goal, and that the system goals are chosen by the greatest value corresponding to these processes in the system goal lattice.


          A Scalable Algorithm for Privacy-Preserving Item-based Top-N Recommendation. (arXiv:1811.02217v1 [cs.CR])

Authors: Yingying Zhao, Dongsheng Li, Qin Lv, Li Shang

Recommender systems have become an indispensable component in online services during recent years. Effective recommendation is essential for improving the services of various online business applications. However, serious privacy concerns have been raised on recommender systems requiring the collection of users' private information for recommendation. At the same time, the success of e-commerce has generated massive amounts of information, making scalability a key challenge in the design of recommender systems. As such, it is desirable for recommender systems to protect users' privacy while achieving high-quality recommendations with low-complexity computations.

This paper proposes a scalable privacy-preserving item-based top-N recommendation solution, which can achieve high-quality recommendations with reduced computation complexity while ensuring that users' private information is protected. Furthermore, the computation complexity of the proposed method increases slowly as the number of users increases, thus providing high scalability for privacy-preserving recommender systems. More specifically, the proposed approach consists of two key components: (1) MinHash-based similarity estimation and (2) client-side privacy-preserving prediction generation. Our theoretical and experimental analysis using real-world data demonstrates the efficiency and effectiveness of the proposed approach.
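For readers unfamiliar with the first component, the following is a generic MinHash sketch for estimating the Jaccard similarity between two items, each represented by the set of users who interacted with it. It is a textbook illustration, not the paper's algorithm: the seeded BLAKE2b hashing scheme, the function names and the toy user sets are all assumptions, and the client-side privacy-preserving prediction step is not shown.

```python
import hashlib


def _hash(seed, item):
    """64-bit hash of an item under a given seed (one seed per hash function)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over the set."""
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


# Toy example: user IDs who rated each of two items.
users_item_a = set(range(0, 100))
users_item_b = set(range(50, 150))

sig_a = minhash_signature(users_item_a, num_hashes=256)
sig_b = minhash_signature(users_item_b, num_hashes=256)
print("exact:", exact_jaccard(users_item_a, users_item_b))
print("estimate:", estimate_jaccard(sig_a, sig_b))
```

The appeal for scalability is that a signature has fixed length regardless of how many users rated the item, so comparing two items costs time proportional to the number of hash functions rather than to the number of users.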


          CarePre: An Intelligent Clinical Decision Assistance System. (arXiv:1811.02218v1 [cs.HC])

Authors: Zhuochen Jin, Jingshun Yang, Shuyuan Cui, David Gotz, Jimeng Sun, Nan Cao

Clinical decision support systems (CDSS) are widely used to assist with medical decision making. However, CDSS typically require manually curated rules and other data which are difficult to maintain and keep up-to-date. Recent systems leverage advanced deep learning techniques and electronic health records (EHR) to provide more timely and precise results. Many of these techniques have been developed with a common focus on predicting upcoming medical events. However, while the prediction results from these approaches are promising, their value is limited by their lack of interpretability. To address this challenge, we introduce CarePre, an intelligent clinical decision assistance system. The system extends a state-of-the-art deep learning model to predict upcoming diagnosis events for a focal patient based on his/her historical medical records. The system includes an interactive framework together with intuitive visualizations designed to support diagnosis, treatment outcome analysis, and the interpretation of the analysis results. We demonstrate the effectiveness and usefulness of the CarePre system by reporting results from a quantitative evaluation of the prediction algorithm, a case study, and three interviews with senior physicians.


          A Quasi-Newton algorithm on the orthogonal manifold for NMF with transform learning. (arXiv:1811.02225v1 [stat.ML])

Authors: Pierre Ablin (PARIETAL), Dylan Fagot (IRIT), Herwig Wendt (IRIT), Alexandre Gramfort (PARIETAL), Cédric Févotte (IRIT)

Nonnegative matrix factorization (NMF) is a popular method for audio spectral unmixing. While NMF is traditionally applied to off-the-shelf time-frequency representations based on the short-time Fourier or Cosine transforms, the ability to learn transforms from raw data is attracting increasing attention. However, this adds a significant computational overhead. When the transform is assumed orthogonal (like the Fourier or Cosine transforms), learning it yields a non-convex optimization problem on the orthogonal matrix manifold. In this paper, we derive a quasi-Newton method on the manifold using sparse approximations of the Hessian. Experiments on synthetic and real audio data show that the proposed algorithm outperforms state-of-the-art first-order and coordinate-descent methods by orders of magnitude. A Python package for fast TL-NMF is released online at https://github.com/pierreablin/tlnmf.


          Kernel Exponential Family Estimation via Doubly Dual Embedding. (arXiv:1811.02228v1 [cs.LG])

Authors: Bo Dai, Hanjun Dai, Arthur Gretton, Le Song, Dale Schuurmans, Niao He

We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space. Key to our approach is a novel technique, doubly dual embedding, that avoids computation of the partition function. This technique also allows the development of a flexible sampling strategy that amortizes the cost of Monte-Carlo sampling in the inference stage. The resulting estimator can be easily generalized to kernel conditional exponential families. We furthermore establish a connection between infinite-dimensional exponential family estimation and MMD-GANs, revealing a new perspective for understanding GANs. Compared to current score matching based estimators, the proposed method improves both memory and time efficiency while enjoying stronger statistical properties, such as fully capturing smoothness in its statistical convergence rate while the score matching estimator appears to saturate. Finally, we show that the proposed estimator can empirically outperform state-of-the-art methods in both kernel exponential family estimation and its conditional extension.


          CIS at TAC Cold Start 2015: Neural Networks and Coreference Resolution for Slot Filling. (arXiv:1811.02230v1 [cs.CL])

Authors: Heike Adel, Hinrich Schütze

This paper describes the CIS slot filling system for the TAC Cold Start evaluations 2015. It extends and improves the system we built for last year's evaluation, and it mainly describes the changes to that system. In particular, it focuses on the coreference and classification components. For coreference, we have performed several analyses and prepared a resource to simplify our end-to-end system and improve its runtime. For classification, we propose to use neural networks. We have trained convolutional and recurrent neural networks and combined them with traditional evaluation methods, namely patterns and support vector machines. Our runs for the 2015 evaluation have been designed to directly assess the effect of each network on the end-to-end performance of the system. The CIS system achieved rank 3 of all slot filling systems participating in the task.


          Weakly Supervised Scene Parsing with Point-based Distance Metric Learning. (arXiv:1811.02233v1 [cs.CV])

Authors: Rui Qian, Yunchao Wei, Honghui Shi, Jiachen Li, Jiaying Liu, Thomas Huang

Semantic scene parsing suffers from the fact that pixel-level annotations are hard to collect. To tackle this issue, we propose Point-based Distance Metric Learning (PDML) in this paper. PDML does not require densely annotated masks and only leverages a few labeled points, which are much easier to obtain, to guide the training process. Concretely, we leverage the semantic relationship among the annotated points by encouraging the feature representations of intra- and inter-category points to be consistent, i.e. points within the same category should have more similar feature representations than points from different categories. We formulate this characteristic as a simple distance metric loss, which works together with the point-wise cross-entropy loss to optimize the deep neural networks. Furthermore, to fully exploit the limited annotations, distance metric learning is conducted across different training images instead of within each image separately. We conduct extensive experiments on two challenging scene parsing benchmarks, PASCAL-Context and ADE 20K, to validate the effectiveness of our PDML, and competitive mIoU scores are achieved.


          Semantic bottleneck for computer vision tasks. (arXiv:1811.02234v1 [cs.CV])

Authors: Maxime Bucher (Palaiseau), Stéphane Herbin (Palaiseau), Frédéric Jurie

This paper introduces a novel method for the representation of images that is semantic by nature, addressing the question of computation intelligibility in computer vision tasks. More specifically, we propose to introduce what we call a semantic bottleneck in the processing pipeline: a crossing point at which the representation of the image is entirely expressed in natural language, while retaining the efficiency of numerical representations. We show that our approach is able to generate semantic representations that give state-of-the-art results on semantic content-based image retrieval and also perform very well on image classification tasks. Intelligibility is evaluated through user-centered experiments on failure detection.


          SparseFool: a few pixels make a big difference. (arXiv:1811.02248v1 [cs.CV])

Authors: Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard

Deep Neural Networks have achieved extraordinary results on image classification tasks, but have been shown to be vulnerable to attacks with carefully crafted perturbations of the input data. Although most attacks change the values of many of an image's pixels, it has been shown that deep networks are also vulnerable to sparse alterations of the input. However, no efficient method has been proposed to compute sparse perturbations. In this paper, we exploit the low mean curvature of the decision boundary, and propose SparseFool, a geometry-inspired sparse attack that controls the sparsity of the perturbations. Extensive evaluations show that our approach outperforms related methods, and scales to high-dimensional data. We further analyze the transferability and the visual effects of the perturbations, and show the existence of shared semantic information across the images and the networks. Finally, we show that adversarial training using $\ell_\infty$ perturbations can slightly improve the robustness against sparse additive perturbations.


          On the Resource Consumption of M2M Random Access: Efficiency and Pareto Optimality. (arXiv:1811.02249v1 [cs.IT])

Authors: Mikhail Vilgelm, Sergio Rueda Linares, Wolfgang Kellerer

The advent of Machine-to-Machine communication has sparked a new wave of interest in random access protocols, especially as applied to LTE Random Access (RA). By analogy with classical slotted ALOHA, the state of the art models LTE RA as a multi-channel slotted ALOHA. In this letter, we direct attention to the resource consumption of RA. We show that the consumption is a random variable that depends on the contention parameters. We consider two approaches to including the consumption in RA optimization: defining resource efficiency, and bi-objective optimization in which resource consumption and throughput are the competing objectives. We then develop an algorithm to obtain a Pareto-optimal RA configuration under a resource constraint. We show that the algorithm achieves lower burst resolution delay and higher throughput than the state of the art.
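As background for the multi-channel slotted ALOHA model, here is the textbook one-shot calculation: if N devices each pick one of M channels uniformly at random, a device succeeds when none of the other N-1 devices picks the same channel, so the expected number of successes is N(1 - 1/M)^(N-1). This is a generic illustration only; it does not reproduce the letter's resource-consumption analysis or its Pareto-optimisation algorithm, and the function names are hypothetical.

```python
import math


def expected_successes(num_devices, num_channels):
    """Expected number of collision-free devices in one contention round."""
    if num_devices == 0:
        return 0.0
    return num_devices * (1 - 1 / num_channels) ** (num_devices - 1)


def best_load(num_channels):
    """Number of contending devices that maximises expected successes."""
    candidates = range(1, 4 * num_channels)
    return max(candidates, key=lambda n: expected_successes(n, num_channels))


channels = 50
load = best_load(channels)
per_channel = expected_successes(channels, channels) / channels
print(f"best load for {channels} channels: about {load} devices")
print(f"successes per channel at that load: {per_channel:.3f} (1/e = {math.exp(-1):.3f})")
```

The maximum is reached when the number of contenders is about equal to the number of channels, where the per-channel throughput approaches 1/e, the same limit as classical single-channel slotted ALOHA.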


          De Bruijn graph and powers of $3/2$. (arXiv:1811.02254v1 [cs.FL])

Authors: Oleksiy Kurganskyy, Igor Potapov

In this paper we consider the set ${\mathbb Z}^{\pm\omega}_{6}$ of two-way infinite words $\xi$ over the alphabet $\{0,1,2,3,4,5\}$ with the integer left part $\lfloor\xi\rfloor$ and the fractional right part $\{\xi\}$ separated by a radix point. For such words, the operations of multiplication by integers and division by $6$ are defined as column multiplication and division in the base-6 numeral system. The paper develops a finite automata approach for the analysis of sequences $\left (\left \lfloor \xi \left (\frac{3}{2} \right)^n \right \rfloor \right)_{n \in {\mathbb Z}}$ for the words $\xi \in {\mathbb Z}^{\pm \omega}_{6}$ that have some common properties with $Z$-numbers in Mahler's $3/2$-problem. Such a sequence of $Z$-words, written under each other with the same digit positions in the same column, is an infinite $2$-dimensional word over the alphabet ${\mathbb Z}_6$. The automata representation of the columns in the integer part of $2$-dimensional $Z$-words has the nice structural properties of the de Bruijn graphs. This approach provides some sufficient conditions for the emptiness of the set of $Z$-numbers. Our approach was initially inspired by Proposition 2.5 in [1], where the authors apply cellular automata to the analysis of $\left(\left\{\xi\left(\frac{3}{2}\right)^n\right\} \right)_{n\in{\mathbb Z}}$, $\xi\in{\mathbb R}$.
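To illustrate why base 6 is a natural choice here: multiplying by 3/2 is the same as multiplying by 9 and dividing by 6, and division by 6 in base 6 is just a shift of the radix point. The sketch below works only with ordinary rational numbers and non-negative exponents, not with the two-way infinite words of the paper, and the function names are hypothetical. It lists the integer parts of xi * (3/2)^n in base-6 digits.

```python
from fractions import Fraction


def to_base6(n):
    """Base-6 digit string of a non-negative integer."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, d = divmod(n, 6)
        digits.append(str(d))
    return "".join(reversed(digits))


def floor_orbit(xi, steps):
    """Integer parts of xi * (3/2)**n for n = 0..steps, using exact arithmetic."""
    x = Fraction(xi)
    out = []
    for _ in range(steps + 1):
        out.append(x.numerator // x.denominator)  # exact floor of a rational
        x *= Fraction(3, 2)
    return out


# Rows written under each other, as in the 2-dimensional words of the abstract.
for n, value in enumerate(floor_orbit(1, 8)):
    print(f"n={n}: {to_base6(value):>4}")
```

Using `Fraction` keeps every power of 3/2 exact, which matters because the question is about individual digits rather than approximate magnitudes.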


          Characterizations and Directed Path-Width of Sequence Digraphs. (arXiv:1811.02259v1 [cs.DS])

Authors: Frank Gurski, Carolin Rehs, Jochen Rethmann

Computing the directed path-width of a directed graph is an NP-hard problem. Even for digraphs of maximum semi-degree 3 the problem remains hard. We propose a decomposition of an input digraph G=(V,A) by a number k of sequences with entries from V, such that (u,v) in A if and only if in one of the sequences there is an occurrence of u appearing before an occurrence of v. We present several graph-theoretical properties of these digraphs. Among these, we give forbidden subdigraphs of digraphs which can be defined by k=1 sequence, which form a subclass of semicomplete digraphs. Given the decomposition of a digraph G, we show an algorithm which computes the directed path-width of G in time O(k\cdot (1+N)^k), where N denotes the maximum sequence length. This leads to an XP-algorithm w.r.t. k for the directed path-width problem. Our result improves on the algorithms of Kitsunai et al. for digraphs of large directed path-width which can be decomposed by a small number of sequences.
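The arc rule stated in the abstract can be sketched directly: given k sequences, add the arc (u, v) whenever some sequence contains an occurrence of u before an occurrence of v. The function name is hypothetical, and excluding loops (u, u) when a vertex repeats is an assumption made here, since the abstract does not say how repeated entries of the same vertex are treated. The sketch only builds the digraph; it does not compute directed path-width.

```python
def sequence_digraph(sequences):
    """Build (vertices, arcs) from sequences using the 'u occurs before v' rule."""
    vertices = set()
    arcs = set()
    for seq in sequences:
        seen = set()  # vertices that have already occurred in this sequence
        for v in seq:
            vertices.add(v)
            for u in seen:
                if u != v:  # assumption: no loops
                    arcs.add((u, v))
            seen.add(v)
    return vertices, arcs


vertices, arcs = sequence_digraph([["a", "b", "a"], ["c", "b"]])
print(sorted(vertices))
print(sorted(arcs))
```

In the toy input, the repeated "a" in the first sequence yields arcs in both directions between "a" and "b", which shows how repeated entries let a single sequence encode cycles.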


          NIPS4Bplus: a richly annotated birdsong audio dataset. (arXiv:1811.02275v1 [cs.SD])

Authors: Veronica Morfi, Yves Bas, Hanna Pamuła, Hervé Glotin, Dan Stowell

Recent advances in birdsong detection and classification have approached a limit due to the lack of fully annotated recordings. In this paper, we present NIPS4Bplus, the first richly annotated birdsong audio dataset, that is comprised of recordings containing bird vocalisations along with their active species tags plus the temporal annotations acquired for them. Statistical information about the recordings, their species specific tags and their temporal annotations are presented along with example uses. NIPS4Bplus could be used in various ecoacoustic tasks, such as training models for bird population monitoring, species classification, birdsong vocalisation detection and classification.


          Blockchain based Proxy Re-Encryption Scheme for Secure IoT Data Sharing. (arXiv:1811.02276v1 [cs.CR])

Authors: Ahsan Manzoor, Madhsanka Liyanage, An Braeken, Salil S. Kanhere, Mika Ylianttila

Data is central to the Internet of Things (IoT) ecosystem. Most current IoT systems use centralized cloud-based data sharing systems, which will be difficult to scale up to meet the demands of future IoT systems. Involving such a third-party service provider also requires trust from both the sensor owner and the sensor data user. Moreover, fees need to be paid for their services. To tackle both the scalability and trust issues and to automate the payments, this paper presents a blockchain based proxy re-encryption scheme. The system stores the IoT data in a distributed cloud after encryption. To share the collected IoT data, the system establishes runtime dynamic smart contracts between the sensor and the data user without the involvement of a trusted third party. It also uses a very efficient proxy re-encryption scheme which ensures that the data is visible only to the owner and the person present in the smart contract. This novel combination of smart contracts with proxy re-encryption provides an efficient, fast and secure platform for storing, trading and managing sensor data. The proposed system is implemented in an Ethereum based testbed to analyze its performance and security properties.


          Off-the-Shelf Unsupervised NMT. (arXiv:1811.02278v1 [cs.CL])

Authors: Chris Hokamp, Sebastian Ruder, John Glover

We frame unsupervised machine translation (MT) in the context of multi-task learning (MTL), combining insights from both directions. We leverage off-the-shelf neural MT architectures to train unsupervised MT models with no parallel data and show that such models can achieve reasonably good performance, competitive with models purpose-built for unsupervised MT. Finally, we propose improvements that allow us to apply our models to English-Turkish, a truly low-resource language pair.


          Comparison of Discrete Choice Models and Artificial Neural Networks in Presence of Missing Variables. (arXiv:1811.02284v1 [stat.ML])

Authors: Johan Barthélemy, Morgane Dumont, Timoteo Carletti

Classification, the process of assigning a label (or class) to an observation given its features, is a common task in many applications. Nonetheless, in most real-life applications the labels cannot be fully explained by the observed features. Indeed, there can be many factors hidden from the modellers. The unexplained variation is then treated as random noise, which is handled differently depending on the method chosen by the practitioner. This work focuses on two simple and widely used supervised classification algorithms, discrete choice models and artificial neural networks, in the context of binary classification.

Through various numerical experiments involving continuous or discrete explanatory features, we present a comparison of the performance of these methods in the presence of missing variables. The impact of the distribution of the two classes in the training data is also investigated. The outcomes of those experiments highlight the fact that artificial neural networks outperform discrete choice models, except when the distribution of the classes in the training data is highly unbalanced.

Finally, this work provides some guidelines for choosing the right classifier with respect to the training data.


          Defining Big Data Analytics Benchmarks for Next Generation Supercomputers. (arXiv:1811.02287v1 [cs.PF])

Authors: Drew Schmidt, Junqi Yin, Michael Matheson, Bronson Messer, Mallikarjun Shankar

The design and construction of high performance computing (HPC) systems rely on exhaustive performance analysis and benchmarking. Traditionally this activity has been geared exclusively towards simulation scientists, who, unsurprisingly, have been the primary customers of HPC for decades. However, there is a large and growing volume of data science work that requires these large-scale resources, and as such the calls for inclusion and investments in data for HPC have been increasing. When designing a next generation HPC platform, it is therefore necessary to have HPC-amenable big data analytics benchmarks. In this paper, we propose a set of big data analytics benchmarks and sample codes designed for testing the capabilities of current and next generation supercomputers.


          High Dimensional Clustering with $r$-nets. (arXiv:1811.02288v1 [cs.CG])

Authors: Georgia Avarikioti, Alain Ryser, Yuyi Wang, Roger Wattenhofer

Clustering, a fundamental task in data science and machine learning, groups a set of objects in such a way that objects in the same cluster are closer to each other than to those in other clusters. In this paper, we consider a well-known structure, so-called $r$-nets, which rigorously captures the properties of clustering. We devise algorithms that improve the run-time of approximating $r$-nets in high-dimensional spaces with $\ell_1$ and $\ell_2$ metrics from $\tilde{O}(dn^{2-\Theta(\sqrt{\epsilon})})$ to $\tilde{O}(dn + n^{2-\alpha})$, where $\alpha = \Omega({\epsilon^{1/3}}/{\log(1/\epsilon)})$. These algorithms are also used to improve a framework that provides approximate solutions to other high dimensional distance problems. Using this framework, several important related problems can also be solved efficiently, e.g., $(1+\epsilon)$-approximate $k$th-nearest neighbor distance, $(4+\epsilon)$-approximate Min-Max clustering, $(4+\epsilon)$-approximate $k$-center clustering. In addition, we build an algorithm that $(1+\epsilon)$-approximates greedy permutations in time $\tilde{O}((dn + n^{2-\alpha}) \cdot \log{\Phi})$ where $\Phi$ is the spread of the input. This algorithm is used to $(2+\epsilon)$-approximate $k$-center with the same time complexity.
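For orientation, a naive quadratic-time construction of an $r$-net is sketched below. It assumes one common definition (conventions differ on strict versus non-strict inequalities): a subset of the input in which every two chosen points are more than $r$ apart, and every input point lies within distance $r$ of some chosen point. This greedy baseline is not the paper's algorithm, which is designed to beat quadratic time in high dimensions; the function name and toy points are hypothetical.

```python
import math


def greedy_r_net(points, r):
    """Scan the points once, keeping each point that is farther than r from all kept points."""
    net = []
    for p in points:
        if all(math.dist(p, q) > r for q in net):
            net.append(p)
    return net


points = [(0, 0), (1, 0), (3, 0), (3.5, 0), (10, 10)]
net = greedy_r_net(points, r=1.5)
print(net)

# Covering check: every input point is within r of some net point.
print(all(any(math.dist(p, q) <= 1.5 for q in net) for p in points))
```

Each point is compared against every net point found so far, which gives the quadratic worst case that faster approximation algorithms aim to avoid.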


          Revealing Fine Structures of the Retinal Receptive Field by Deep Learning Networks. (arXiv:1811.02290v1 [q-bio.NC])

Authors: Qi Yan, Yajing Zheng, Shanshan Jia, Yichen Zhang, Zhaofei Yu, Feng Chen, Yonghong Tian, Tiejun Huang, Jian K. Liu

Deep convolutional neural networks (CNNs) have demonstrated impressive performance on many visual tasks. Recently, they have become useful models of the visual system in neuroscience. However, it is still not clear what CNNs learn in terms of neuronal circuits. When a deep CNN with many layers is used to model the visual system, it is not easy to compare the structural components of the CNN with possible neuroscience underpinnings, due to the highly complex circuits from the retina to the higher visual cortex. Here we address this issue by focusing on single retinal ganglion cells with biophysical models and recording data from animals. By training CNNs with white noise images to predict neuronal responses, we found that fine structures of the retinal receptive field can be revealed. Specifically, the learned convolutional filters resemble biological components of the retinal circuit. This suggests that a CNN learning from one single retinal cell reveals a minimal neural network carried out in this cell. Furthermore, when CNNs learned from different cells are transferred between cells, there is a diversity of transfer learning performance, which indicates that CNNs are cell-specific. Moreover, when CNNs are transferred between different types of input images, here white noise vs. natural images, transfer learning shows good performance, which implies that the CNN indeed captures the full computational ability of a single retinal cell for different inputs. Taken together, these results suggest that CNNs could be used to reveal structural components of neuronal circuits and provide a powerful model for neural system identification.


          Infrared and visible image fusion using a novel deep decomposition method. (arXiv:1811.02291v1 [cs.CV])

Authors: Hui Li, Xiao-Jun Wu

Infrared and visible image fusion is an important problem in image fusion and has been applied widely in many fields. To better preserve the useful information from source images, we propose an effective image fusion framework using a novel deep decomposition method based on Latent Low-Rank Representation (LatLRR), which we name DDLatLRR. First, LatLRR is used to learn a projection matrix, which is then used to extract salient features. Then, the base part and multi-level detail parts are obtained by DDLatLRR. With adaptive fusion strategies, the fused base part and the fused detail parts are reconstructed. Finally, the fused image is obtained by combining the fused base part and the fused detail parts. In experimental comparisons, the proposed algorithm has better fusion performance than state-of-the-art fusion methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/exceptionLi/imagefusion_deepdecomposition


          Defeating the Downgrade Attack on Identity Privacy in 5G. (arXiv:1811.02293v1 [cs.CR])

Authors: Mohsin Khan, Philip Ginzboorg, Kimmo Järvinen, Valtteri Niemi

3GPP Release 15, the first 5G standard, includes protection of user identity privacy against IMSI catchers. These protection mechanisms are based on public key encryption. Despite this protection, IMSI catching is still possible in LTE networks which opens the possibility of a downgrade attack on user identity privacy, where a fake LTE base station obtains the identity of a 5G user equipment. We propose (i) to use an existing pseudonym-based solution to protect user identity privacy of 5G user equipment against IMSI catchers in LTE and (ii) to include a mechanism for updating LTE pseudonyms in the public key encryption based 5G identity privacy procedure. The latter helps to recover from a loss of synchronization of LTE pseudonyms. Using this mechanism, pseudonyms in the user equipment and home network are automatically synchronized when the user equipment connects to 5G. Our mechanisms utilize existing LTE and 3GPP Release 15 messages and require modifications only in the user equipment and home network in order to provide identity privacy. Additionally, lawful interception requires minor patching in the serving network.


          Unboxing Mutually Recursive Type Definitions in OCaml. (arXiv:1811.02300v1 [cs.PL])

Authors: Simon Colin, Rodolphe Lepigre, Gabriel Scherer

In modern OCaml, single-argument datatype declarations (variants with a single constructor, records with a single field) can sometimes be "unboxed". This means that their memory representation is the same as their single argument (omitting the variant or record constructor and an indirection), thus achieving better time and memory efficiency.

However, in the case of generalized/guarded algebraic datatypes (GADTs), unboxing is not always possible due to a subtle assumption about the runtime representation of OCaml values. The current correctness check is incomplete, rejecting many valid definitions, in particular those involving mutually-recursive datatype declarations.

In this paper, we explain the notion of separability as a semantics for the unboxing criterion, and propose a set of inference rules to check separability. From these inference rules, we derive a new implementation of the unboxing check that properly supports mutually-recursive definitions.


          A Backstepping control strategy for constrained tendon driven robotic finger. (arXiv:1811.02301v1 [cs.SY])

Authors: Kunal Sanjay Narkhede, Aashay Anil Bhise, IA Sainul, Sankha Deb

The task of controlling an underactuated robotic finger with a single tendon and a single actuator is difficult. Methods for controlling the class of underactuated systems are available in the literature. However, this particular system does not fall into the class of underactuated systems. This paper presents a design change which introduces kinematic constraints into the system, making the system controllable. A backstepping control strategy is used to control the system. Simulation results are presented for a single finger driven by a single actuator.


          Modular Materialisation of Datalog Programs. (arXiv:1811.02304v1 [cs.AI])

Authors: Pan Hu, Boris Motik, Ian Horrocks

The seminaïve algorithm can materialise all consequences of arbitrary datalog rules, and it also forms the basis for incremental algorithms that update a materialisation as the input facts change. Certain (combinations of) rules, however, can be handled much more efficiently using custom algorithms. To integrate such algorithms into a general reasoning approach that can handle arbitrary rules, we propose a modular framework for materialisation computation and its maintenance. We split a datalog program into modules that can be handled using specialised algorithms, and handle the remaining rules using the seminaïve algorithm. We also present two algorithms for computing the transitive and the symmetric-transitive closure of a relation that can be used within our framework. Finally, we show empirically that our framework can handle arbitrary datalog programs while outperforming existing approaches, often by orders of magnitude.
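
As a rough illustration of the baseline the abstract refers to, the sketch below runs the generic delta-driven evaluation strategy on the two standard transitive-closure rules. It is not the paper's specialised closure algorithm or its modular framework, and the function name is hypothetical.

```python
def transitive_closure_seminaive(edges):
    """Materialise path/2 for the rules
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Each round joins only the facts derived in the previous round (delta)."""
    succ = {}
    for x, y in edges:
        succ.setdefault(x, set()).add(y)
    total = set(edges)   # all path facts derived so far
    delta = set(edges)   # path facts that were new in the last round
    while delta:
        derived = {(x, z) for (x, y) in delta for z in succ.get(y, ())}
        delta = derived - total   # keep only genuinely new facts
        total |= delta
    return total
```

Because each round only joins `delta` with `edge`, no rule instance is re-derived from the same premises in a later round, which is the point of the delta-driven strategy.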


          Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning. (arXiv:1811.02307v1 [cs.CV])

Authors: Vasili Ramanishka, Yi-Ting Chen, Teruhisa Misu, Kate Saenko

Driving scene understanding is a key ingredient for intelligent transportation systems. To operate in a complex physical and social environment, such systems need to understand and learn how humans drive and interact with traffic scenes. We present the Honda Research Institute Driving Dataset (HDD), a challenging dataset to enable research on learning driver behavior in real-life environments. The dataset includes 104 hours of real human driving in the San Francisco Bay Area collected using an instrumented vehicle equipped with different sensors. We provide a detailed analysis of HDD with a comparison to other driving datasets. A novel annotation methodology is introduced to enable research on driver behavior understanding from untrimmed data sequences. As the first step, baseline algorithms for driver behavior detection are trained and tested to demonstrate the feasibility of the proposed task.


          Fast Adaptive Bilateral Filtering. (arXiv:1811.02308v1 [cs.CV])

Authors: Ruturaj G. Gavaskar, Kunal N. Chaudhury

In the classical bilateral filter, a fixed Gaussian range kernel is used along with a spatial kernel for edge-preserving smoothing. We consider a generalization of this filter, the so-called adaptive bilateral filter, where the center and width of the Gaussian range kernel are allowed to change from pixel to pixel. Though this variant was originally proposed for sharpening and noise removal, it can also be used for other applications such as artifact removal and texture filtering. Similar to the bilateral filter, the brute-force implementation of its adaptive counterpart requires intense computations. While several fast algorithms have been proposed in the literature for bilateral filtering, most of them work only with a fixed range kernel. In this paper, we propose a fast algorithm for adaptive bilateral filtering, whose complexity does not scale with the spatial filter width. This is based on the observation that the concerned filtering can be performed purely in range space using an appropriately defined local histogram. We show that by replacing the histogram with a polynomial and the finite range-space sum with an integral, we can approximate the filter using analytic functions. In particular, an efficient algorithm is derived using the following innovations: the polynomial is fitted by matching its moments to those of the target histogram (this is done using fast convolutions), and the analytic functions are recursively computed using integration-by-parts. Our algorithm can accelerate the brute-force implementation by at least $20 \times$, without perceptible distortions in the visual quality. We demonstrate the effectiveness of our algorithm for sharpening, JPEG deblocking, and texture filtering.
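
For orientation, the brute-force computation that the fast algorithm replaces can be sketched as follows. This is a 1D toy under assumed conventions (Gaussian spatial kernel, per-sample range center and width passed in as lists); the function and parameter names are hypothetical, and the paper's histogram and polynomial approximation is not shown.

```python
import math

def adaptive_bilateral_1d(signal, centers, widths, radius, sigma_s):
    """Brute-force adaptive bilateral filter on a 1D signal.

    centers[i] and widths[i] are the center and standard deviation of the
    Gaussian range kernel used at sample i; cost grows with the window size.
    """
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((j - i) ** 2) / (2.0 * sigma_s ** 2))
            rng = math.exp(-((signal[j] - centers[i]) ** 2) / (2.0 * widths[i] ** 2))
            num += spatial * rng * signal[j]
            den += spatial * rng
        # Fall back to the input sample if every weight underflowed to zero.
        out.append(num / den if den > 0.0 else signal[i])
    return out
```

Setting `centers` equal to `signal` and all `widths` to one constant recovers the classical bilateral filter; the inner loop over the window is what makes the brute-force cost scale with the spatial filter width.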


          An Enhanced Multi-Objective Biogeography-Based Optimization Algorithm for Automatic Detection of Overlapping Communities in a Social Network with Node Attributes. (arXiv:1811.02309v1 [cs.SI])

Authors: Ali Reihanian, Mohammad-Reza Feizi-Derakhshi, Hadi S. Aghdasi

Community detection is one of the most important and interesting issues in social network analysis. In recent years, considering node attributes and the topological structure of social networks simultaneously during community detection has attracted the attention of many scholars, and some recent community detection methods use this approach to increase their efficiency and to find more meaningful and relevant communities. However, most of these methods tend to find non-overlapping communities, while many real-world networks include communities that often overlap to some extent. To solve this problem, this paper proposes an evolutionary algorithm called MOBBO-OCD, based on multi-objective biogeography-based optimization (BBO), which automatically finds overlapping communities in a social network with node attributes by simultaneously considering the density of connections and the similarity of node attributes in the network. In MOBBO-OCD, an extended locus-based adjacency representation called OLAR is introduced to encode and decode overlapping communities. Based on OLAR, a rank-based migration operator, a novel two-phase mutation strategy, and a new double-point crossover are used in the evolution process of MOBBO-OCD to effectively lead the population along the evolution path. To assess the performance of MOBBO-OCD, a new metric called alpha_SAEM is proposed, which evaluates the goodness of both overlapping and non-overlapping partitions by considering both node attributes and linkage structure. Quantitative evaluations reveal that MOBBO-OCD achieves favorable results that are superior to those of 15 relevant community detection algorithms in the literature.


          Kernel Regression for Graph Signal Prediction in Presence of Sparse Noise. (arXiv:1811.02314v1 [stat.ML])

Authors: Arun Venkitaraman, Pascal Frossard, Saikat Chatterjee

In the presence of sparse noise, we propose kernel regression for predicting output vectors which are smooth over a given graph. Sparse noise models training outputs that are corrupted either by missing samples or by large perturbations. The sparse noise is handled by combining an $\ell_1$-norm with an $\ell_2$-norm in a convex cost function. To optimize the cost function, we propose an iteratively reweighted least-squares (IRLS) approach that is suitable for kernel substitution (the kernel trick) because each iteration has a closed-form solution. Simulations using real-world temperature data show the efficacy of our proposed method, mainly for limited-size training datasets.
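
The IRLS idea can be seen in a scalar toy problem: minimising the $\ell_1$ cost $\sum_i |y_i - m|$ by repeatedly solving a weighted least-squares problem whose solution is closed-form. This is only an illustration of the reweighting principle under that simplified cost, not the paper's kernel regression over graphs; the function name and the `eps` safeguard are assumptions.

```python
def irls_l1_location(y, iters=100, eps=1e-8):
    """Minimise sum_i |y_i - m| over m by iteratively reweighted least squares."""
    m = sum(y) / len(y)  # ordinary least-squares start (the mean)
    for _ in range(iters):
        # Weight 1/|residual| turns the squared loss into an absolute loss;
        # eps guards against division by zero when a residual vanishes.
        w = [1.0 / max(abs(v - m), eps) for v in y]
        # Closed-form minimiser of the weighted least-squares problem.
        m = sum(wi * v for wi, v in zip(w, y)) / sum(w)
    return m
```

With one large outlier in `y`, the estimate moves from the mean toward the median, which is how an $\ell_1$ term suppresses sparse, large perturbations.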


          Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning. (arXiv:1811.02316v1 [stat.ML])

Authors: Wouter van Loon, Marjolein Fokkema, Botond Szabo, Mark de Rooij

In multi-view learning, features are organized into multiple sets called views. Multi-view stacking (MVS) is an ensemble learning framework which learns a prediction function from each view separately, and then learns a meta-function which optimally combines the view-specific predictions. In case studies, MVS has been shown to increase prediction accuracy. However, the framework can also be used for selecting a subset of important views. We propose a method for selecting views based on MVS, which we call stacked penalized logistic regression (StaPLR). Compared to existing view-selection methods like the group lasso, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views are important for preventing unimportant views from entering the model. We investigate the view selection and classification performance of StaPLR and the group lasso through simulations, and consider two real data examples. We observe that StaPLR is less likely to select irrelevant views, leading to models that are sparser at the view level, but which have comparable or increased predictive performance.


          Recurrent Skipping Networks for Entity Alignment. (arXiv:1811.02318v1 [cs.CL])

Authors: Lingbing Guo, Zequn Sun, Ermei Cao, Wei Hu

We consider the problem of learning knowledge graph (KG) embeddings for entity alignment (EA). Current methods use embedding models that mainly focus on triple-level learning, which lacks the ability to capture long-term dependencies existing in KGs. Consequently, embedding-based EA methods rely heavily on the amount of prior (known) alignment, because the identity information in the prior alignment cannot be efficiently propagated from one KG to another. In this paper, we propose RSN4EA (recurrent skipping networks for EA), which leverages biased random walk sampling for generating long paths across KGs and models the paths with a novel recurrent skipping network (RSN). RSN integrates the conventional recurrent neural network (RNN) with residual learning and can largely improve the convergence speed and performance with only a few more parameters. We evaluated RSN4EA on a series of datasets constructed from real-world KGs. Our experimental results showed that it outperformed a number of state-of-the-art embedding-based EA methods and also achieved comparable performance for KG completion.


          Fast Hyperparameter Optimization of Deep Neural Networks via Ensembling Multiple Surrogates. (arXiv:1811.02319v1 [cs.LG])

Authors: Yang Li, Jiawei Jiang, Yingxia Shao, Bin Cui

The performance of deep neural networks (DNNs) crucially depends on good hyperparameter configurations. Bayesian optimization is a powerful framework for optimizing the hyperparameters of DNNs. These methods need sufficient evaluation data to approximate and minimize the validation error as a function of the hyperparameters. However, the expensive evaluation cost of DNNs yields very little evaluation data within a limited time, which greatly reduces the efficiency of Bayesian optimization. Besides, previous research focuses on using the complete evaluation data to conduct Bayesian optimization, and ignores the intermediate evaluation data generated by early stopping methods. To alleviate the insufficient evaluation data problem, we propose a fast hyperparameter optimization method, HOIST, that utilizes both the complete and intermediate evaluation data to accelerate the hyperparameter optimization of DNNs. Specifically, we train multiple basic surrogates to gather information from the mixed evaluation data, and then combine all basic surrogates using weighted bagging to provide an accurate ensemble surrogate. Our empirical studies show that HOIST outperforms the state-of-the-art approaches on a wide range of DNNs, including feed forward neural networks, convolutional neural networks, recurrent neural networks, and variational autoencoders.


          Hierarchical Neural Network Architecture In Keyword Spotting. (arXiv:1811.02320v1 [cs.CL])

Authors: Yixiao Qu, Sihao Xue, Zhenyi Ying, Hang Zhou, Jue Sun

Keyword Spotting (KWS) provides the start signal for ASR, and thus it is essential to ensure a high recall rate. However, its real-time nature requires low computational complexity. This tension motivates the search for a model that is small yet performs well in multiple environments. To address it, we implement the Hierarchical Neural Network (HNN), which has proved effective in many speech recognition problems. HNN outperforms traditional DNN and CNN even though its model size and computational complexity are slightly smaller. Also, its simple topology makes it easy to deploy on any device.


          Elastic CoCoA: Scaling In to Improve Convergence. (arXiv:1811.02322v1 [cs.LG])

Authors: Michael Kaufmann, Thomas Parnell, Kornilios Kourtis

In this paper we experimentally analyze the convergence behavior of CoCoA and show that the number of workers required to achieve the highest convergence rate at any point in time changes over the course of the training. Based on this observation, we build Chicle, an elastic framework that dynamically adjusts the number of workers based on feedback from the training algorithm, in order to select the number of workers that results in the highest convergence rate. In our evaluation on 6 datasets, we show that Chicle is able to accelerate the time-to-accuracy by a factor of up to 5.96x compared to the best static setting, while being robust enough to find an optimal or near-optimal setting automatically in most cases.


          Super-Identity Convolutional Neural Network for Face Hallucination. (arXiv:1811.02328v1 [cs.CV])

Authors: Kaipeng Zhang, Zhanpeng Zhang, Chia-Wen Cheng, Winston H. Hsu, Yu Qiao, Wei Liu, Tong Zhang

Face hallucination is a generative task to super-resolve a low-resolution facial image, while human perception of faces relies heavily on identity information. However, previous face hallucination approaches largely ignore facial identity recovery. This paper proposes the Super-Identity Convolutional Neural Network (SICNN) to recover identity information for generating faces close to the real identity. Specifically, we define a super-identity loss to measure the identity difference between a hallucinated face and its corresponding high-resolution face within the hypersphere identity metric space. However, directly using this loss will lead to a Dynamic Domain Divergence problem, which is caused by the large margin between the high-resolution domain and the hallucination domain. To overcome this challenge, we present a domain-integrated training approach by constructing a robust identity metric for faces from these two domains. Extensive experimental evaluations demonstrate that the proposed SICNN achieves superior visual quality over the state-of-the-art methods on a challenging task to super-resolve 12$\times$14 faces with an 8$\times$ upscaling factor. In addition, SICNN significantly improves the recognizability of ultra-low-resolution faces.


          Traversing Virtual Network Functions from the Edge to the Core: An End-to-End Performance Analysis. (arXiv:1811.02330v1 [cs.NI])

Authors: Emmanouil Fountoulakis, Qi Liao, Manuel Stein, Nikolaos Pappas

Future mobile networks supporting Internet of Things are expected to provide both high throughput and low latency to user-specific services. One way to overcome this challenge is to adopt network function virtualization and Multi-access edge computing (MEC). In this paper, we analyze an end-to-end communication system that consists of both MEC servers and a server at the core network hosting different types of virtual network functions. We develop a queueing model for the performance analysis of the system consisting of both processing and transmission flows. The system is decomposed into subsystems which are independently analyzed in order to approximate the behaviour of the original system. We provide closed-form expressions of the performance metrics such as system drop rate and average number of tasks in the system. Simulation results show that our approximation performs quite well. By evaluating the system under different scenarios, we provide insights for the decision making on traffic flow control and its impact on critical performance metrics.


          Learning to Embed Sentences Using Attentive Recursive Trees. (arXiv:1811.02338v1 [cs.CL])

Authors: Jiaxin Shi, Lei Hou, Juanzi Li, Zhiyuan Liu, Hanwang Zhang

Sentence embedding is an effective feature representation for most deep learning-based NLP tasks. One prevailing line of methods is using recursive latent tree-structured networks to embed sentences with task-specific structures. However, existing models have no explicit mechanism to emphasize task-informative words in the tree structure. To this end, we propose an Attentive Recursive Tree model (AR-Tree), where the words are dynamically located according to their importance in the task. Specifically, we construct the latent tree for a sentence in a proposed important-first strategy, and place more attentive words nearer to the root; thus, AR-Tree can inherently emphasize important words during the bottom-up composition of the sentence embedding. We propose an end-to-end reinforced training strategy for AR-Tree, which is demonstrated to consistently outperform, or be at least comparable to, the state-of-the-art sentence embedding methods on three sentence understanding tasks.


          Risk-Sensitive Reinforcement Learning for URLLC Traffic in Wireless Networks. (arXiv:1811.02341v1 [cs.NI])

Authors: Nesrine Ben-Khalifa, Mohamad Assaad, Mérouane Debbah

In this paper, we study the problem of dynamic channel allocation for URLLC traffic in a multi-user multi-channel wireless network where urgent packets have to be successfully transmitted in a timely manner. We formulate the problem as a finite-horizon Markov Decision Process with a stochastic constraint related to the QoS requirement, defined as the packet loss rate for each user. We propose a novel weighted formulation that takes into account both the total expected reward (number of successfully transmitted packets) and the risk, which we define as the QoS requirement violation. First, we use the value iteration algorithm to find the optimal policy, which assumes that the controller has perfect knowledge of all the parameters, namely the channel statistics. We then propose a Q-learning algorithm where the controller learns the optimal policy without knowledge of either the CSI or the channel statistics. We illustrate the performance of our algorithms with numerical studies.
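
The value iteration step mentioned above can be sketched for a generic finite-horizon MDP with known transition probabilities. This minimal version maximises expected total reward only; the paper's risk term and channel model are not represented, and the array layout `P[a][s][s2]`, `R[a][s]` is an assumption made for the example.

```python
def finite_horizon_value_iteration(P, R, horizon):
    """Backward induction for a finite-horizon MDP.

    P[a][s][s2]: probability of moving from s to s2 under action a.
    R[a][s]:     expected immediate reward for taking action a in state s.
    Returns (V, policy), where V[s] is the optimal value at time 0 and
    policy[t][s] is the optimal action at time t in state s.
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states  # value after the final step
    policy = []
    for _ in range(horizon):
        q = [[R[a][s] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        step = [max(range(n_actions), key=lambda a: q[s][a])
                for s in range(n_states)]
        V = [q[s][step[s]] for s in range(n_states)]
        policy.insert(0, step)  # built backwards, so earlier times go in front
    return V, policy
```

Unlike the infinite-horizon case, the resulting policy is time-dependent, which is why one action table is stored per time step.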


          Resource Allocation for Device-to-Device Communications Underlaying Heterogeneous Cellular Networks Using Coalitional Games. (arXiv:1811.02350v1 [cs.NI])

Authors: Yali Chen, Bo Ai, Yong Niu, Ke Guan, Zhu Han

Heterogeneous cellular networks (HCNs) that include millimeter wave (mmWave) communications are emerging as a promising candidate for the fifth generation mobile network. With highly directional antenna arrays, mmWave links are able to provide transmission rates of several Gbps. However, mmWave links are easily blocked without line of sight. On the other hand, D2D communications have been proposed to support many content-based applications, and need to share resources with users in HCNs to improve spectral reuse and enhance system capacity. Consequently, an efficient resource allocation scheme for D2D pairs among both mmWave and the cellular carrier band is needed. In this paper, we first formulate the problem of resource allocation among mmWave and the cellular band for multiple D2D pairs from the viewpoint of game theory. Then, with the characteristics of cellular and mmWave communications considered, we propose a coalition formation game to maximize the system sum rate in a statistical average sense. We also theoretically prove that our proposed game converges to a Nash-stable equilibrium and further reaches a near-optimal solution with a fast convergence rate. Through extensive simulations under various system parameters, we demonstrate the superior performance of our scheme in terms of the system sum rate compared with several other practical schemes.


          An Incentive Analysis of some Bitcoin Fee Design. (arXiv:1811.02351v1 [cs.GT])

Authors: Andrew Chi-Chih Yao

In the Bitcoin system, miners are incentivized to join the system and validate transactions through fees paid by the users. A simple "pay your bid" auction has been employed to determine the transaction fees. Recently, Lavi, Sattath and Zohar [LSZ17] proposed an alternative fee design, called the monopolistic price (MP) mechanism, aimed at improving the revenue for the miners. Although MP is not strictly incentive compatible (IC), they studied how close to IC the mechanism is for iid distributions, and conjectured that it is nearly IC asymptotically based on extensive simulations and some analysis. In this paper, we prove that the MP mechanism is nearly incentive compatible for any iid distribution as the number of users grows large. This holds true with respect to other attacks such as splitting bids. We also prove a conjecture in [LSZ17] that MP dominates the RSOP auction in revenue (originally defined in Goldberg et al. [GHKSW06] for digital goods). These results lend support to MP as a Bitcoin fee design candidate. Additionally, we explore some possible intrinsic correlations between incentive compatibility and revenue in general.


          An amplitudes-perturbation data augmentation method in convolutional neural networks for EEG decoding. (arXiv:1811.02353v1 [eess.SP])

Authors: Xian-Rui Zhang, Meng-Ying Lei, Yang Li

A Brain-Computer Interface (BCI) system provides a pathway between humans and the outside world by analyzing brain signals which contain potential neural information. Electroencephalography (EEG) is one of the most commonly used brain signals, and EEG recognition is an important part of a BCI system. Recently, convolutional neural networks (ConvNets) in deep learning have become the new cutting-edge tools for tackling the problem of EEG recognition. However, training an effective deep learning model requires a large amount of data, which limits its application to EEG datasets with a small number of samples. To address data insufficiency in deep learning for EEG decoding, we propose a novel data augmentation method that adds perturbations to the amplitudes of EEG signals after transforming them to the frequency domain. In experiments, we explore the performance of signal recognition with state-of-the-art models before and after data augmentation on BCI Competition IV dataset 2a and our local dataset. The results show that our data augmentation technique can effectively improve the accuracy of EEG recognition.
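
One way to read "perturbing amplitudes in the frequency domain" is sketched below for a single real-valued channel. The choice of transform (a plain DFT, written out naively so the example has no dependencies), the Gaussian noise model, and all function names are assumptions; the abstract does not specify them.

```python
import cmath
import random

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def perturb_amplitudes(signal, noise_std, rng):
    """Add Gaussian noise to the spectral amplitudes, keep the phases,
    and transform back to the time domain."""
    n = len(signal)
    spectrum = dft(signal)
    out = list(spectrum)  # the DC bin (k = 0) is left untouched
    for k in range(1, n // 2 + 1):
        amp, phase = cmath.polar(spectrum[k])
        new_amp = max(0.0, amp + rng.gauss(0.0, noise_std))
        out[k] = cmath.rect(new_amp, phase)
        # Mirror the change so the spectrum stays conjugate-symmetric
        # and the reconstructed signal stays real-valued.
        out[n - k] = out[k].conjugate()
    return [v.real for v in idft(out)]
```

Calling `perturb_amplitudes(trial, noise_std, random.Random(seed))` with different seeds yields several augmented copies of one trial; a real pipeline would more likely use an FFT or short-time transform from a numerical library.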


          Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. (arXiv:1811.02356v1 [cs.CL])

Authors: Ching-Ting Chang, Shun-Po Chuang, Hung-Yi Lee

Code-switching is about dealing with alternative languages in speech or text. It is partially speaker-dependent and domain-related, so completely explaining the phenomenon by linguistic rules is challenging. Compared to monolingual tasks, insufficient data is an issue for code-switching. To mitigate the issue without expensive human annotation, we propose an unsupervised method for code-switching data augmentation. By utilizing a generative adversarial network, we can generate intra-sentential code-switching sentences from monolingual sentences. We applied the proposed method to two corpora, and the results show that the generated code-switching sentences improve the performance of code-switching language models.


          Object 3D Reconstruction based on Photometric Stereo and Inverted Rendering. (arXiv:1811.02357v1 [cs.CV])

Authors: Anish R. Khadka, Paolo Remagnino, Vasileios Argyriou

Methods for 3D reconstruction such as photometric stereo (PS) recover the shape and reflectance properties using multiple images of an object taken with variable lighting conditions from a fixed viewpoint. Photometric stereo assumes that a scene is illuminated only directly by the illumination source. As a result, indirect illumination effects due to inter-reflections introduce strong biases in the recovered shape. Our suggested approach is to recover scene properties in the presence of indirect illumination. To this end, we propose an iterative PS method combined with a reverted Monte-Carlo ray tracing algorithm to overcome the inter-reflection effects, aiming to separate the direct and indirect lighting. This approach iteratively reconstructs a surface considering both the environment around the object and its concavities. We demonstrate and evaluate our approach using three datasets, and the overall results show an improvement over the classic PS approaches.


          Micro-Attention for Micro-Expression recognition. (arXiv:1811.02360v1 [cs.CV])

Authors: Chongyang Wang, Min Peng, Tao Bi, Tong Chen

Micro-expression, for its high objectivity in emotion detection, has emerged as a promising modality in affective computing. Recently, deep learning methods have been successfully introduced into micro-expression recognition. Although deep learning methods achieve higher recognition accuracy, substantial challenges in micro-expression recognition remain. The occurrence of micro-expressions in small local areas of the face and the limited size of databases still constrain the recognition accuracy of such facial behavior. In this work, to tackle these challenges, we propose a novel attention mechanism called micro-attention that cooperates with a residual network. Micro-attention enables the network to learn to focus on facial areas of interest. Moreover, to cope with small datasets, a simple yet efficient transfer learning approach is used to alleviate the overfitting risk. With an extensive experimental evaluation on two benchmarks (CASMEII, SAMM), we demonstrate the effectiveness of the proposed micro-attention and push the boundary of automatic micro-expression recognition.


          Kalman Filter Modifier for Neural Networks in Non-stationary Environments. (arXiv:1811.02361v1 [cs.LG])

Authors: Honglin Li, Frieder Ganz, Shirin Enshaeifar, Payam Barnaghi

Learning in a non-stationary environment is an inevitable problem when applying machine learning algorithms to real-world environments. Learning new tasks without forgetting previous knowledge is a challenging issue in machine learning. We propose a Kalman Filter based modifier to maintain the performance of neural network models in non-stationary environments. The results show that our proposed model can preserve the key information and adapts better to changes. The accuracy of the proposed model decreases by 0.4% in our experiments, while the accuracy of the conventional model decreases by 90% in the drifting environment.


          Fast High-Dimensional Bilateral and Nonlocal Means Filtering. (arXiv:1811.02363v1 [cs.CV])

Authors: Pravin Nair, Kunal N. Chaudhury

Existing fast algorithms for bilateral and nonlocal means filtering mostly work with grayscale images. They cannot easily be extended to high-dimensional data such as color and hyperspectral images, patch-based data, flow-fields, etc. In this paper, we propose a fast algorithm for high-dimensional bilateral and nonlocal means filtering. Unlike existing approaches, where the focus is on approximating the data (using quantization) or the filter kernel (via analytic expansions), we locally approximate the kernel using weighted and shifted copies of a Gaussian, where the weights and shifts are inferred from the data. The algorithm emerging from the proposed approximation essentially involves clustering and fast convolutions, and is easy to implement. Moreover, a variant of our algorithm comes with a guarantee (bound) on the approximation error, which is not enjoyed by existing algorithms. We present some results for high-dimensional bilateral and nonlocal means filtering to demonstrate the speed and accuracy of our proposal. Moreover, we also show that our algorithm can outperform state-of-the-art fast approximations in terms of accuracy and timing.


          Effective Subword Segmentation for Text Comprehension. (arXiv:1811.02364v1 [cs.CL])

Authors: Zhuosheng Zhang, Hai Zhao, Kangwei Ling, Jiangtong Li, Zuchao Li, Shexia He

Character-level representations have been broadly adopted to alleviate the problem of effectively representing rare or complex words. However, a character by itself is not a natural minimal linguistic unit for representation or for composing word embeddings, because it ignores the linguistic coherence of consecutive characters inside a word. This paper presents a general subword-augmented embedding framework for learning and composing computationally derived subword-level representations. We survey a series of unsupervised segmentation methods for subword acquisition and different subword-augmented strategies for text understanding, showing that subword-augmented embedding significantly improves our baselines in multiple text understanding tasks in both English and Chinese.


          A Description Logic Framework for Commonsense Conceptual Combination Integrating Typicality, Probabilities and Cognitive Heuristics. (arXiv:1811.02366v1 [cs.AI])

Authors: Antonio Lieto, Gian Luca Pozzato

We propose a nonmonotonic Description Logic of typicality able to account for the phenomenon of concept combination of prototypical concepts. The proposed logic relies on the logic of typicality ALC TR, whose semantics is based on the notion of rational closure, as well as on the distributed semantics of probabilistic Description Logics, and is equipped with a cognitive heuristic used by humans for concept composition. We first extend the logic of typicality ALC TR by typicality inclusions whose intuitive meaning is that there is probability p about the fact that typical Cs are Ds. As in the distributed semantics, we define different scenarios containing only some typicality inclusions, each one having a suitable probability. We then focus on those scenarios whose probabilities belong to a given and fixed range, and we exploit such scenarios in order to ascribe typical properties to a concept C obtained as the combination of two prototypical concepts. We also show that reasoning in the proposed Description Logic is EXPTIME-complete as for the underlying ALC.


          Scalable Application- and User-aware Resource Allocation in Enterprise Networks Using End-host Pacing. (arXiv:1811.02367v1 [cs.NI])

Authors: Christian Sieber, Susanna Schwarzmann, Andreas Blenk, Thomas Zinner, Wolfgang Kellerer

Scalable user- and application-aware resource allocation for heterogeneous applications sharing an enterprise network is still an unresolved problem. The main challenges are: (i) How to define user- and application-aware shares of resources? (ii) How to determine an allocation of shares of network resources to applications? (iii) How to allocate the shares per application in heterogeneous networks at scale? In this paper we propose solutions to the three challenges and introduce a system design for enterprise deployment. Defining the necessary resource shares per application is hard, as the intended use case and the user's preferences influence the resource demand. Utility functions based on user experience enable a mapping of network resources, in terms of throughput and latency budget, to a common user-level utility scale. A multi-objective MILP is formulated to solve the throughput- and delay-aware embedding of each utility function for a max-min fairness criterion. The allocation of resources in traditional networks with policing and scheduling cannot distinguish large numbers of classes. We propose a resource allocation system design for enterprise networks based on Software-Defined Networking principles to achieve delay-constrained routing in the network and application pacing at the end-hosts. The system design is evaluated against best-effort networks with applications competing for the throughput of a constrained link. The competing applications belong to five application classes: web browsing, file download, remote terminal work, video streaming, and Voice-over-IP. The results show that the proposed methodology improves the minimum and total utility, minimizes packet loss and queuing delay at bottlenecks, establishes fairness in terms of utility between applications, and achieves predictable application performance at high link utilization.


          Identificação automática de pichação a partir de imagens urbanas. (arXiv:1811.02372v1 [cs.CV])

Authors: Eric K. Tokuda, Claudio T. Silva, Roberto M. Cesar-Jr

Graffiti tagging is a common issue in large cities, and local authorities are taking action to combat it. The tagging map of a city can be a useful tool, as it may help to clean up highly saturated regions and discourage future acts in the neighbourhood. Currently, there is no way of obtaining a tagging map of a region automatically; manual inspection or crowd participation is required. In this work, we describe work in progress on an automatic way to obtain a tagging map of a city or region. It is based on the use of street view images and on the detection of graffiti tags in those images.


          Sets of autoencoders with shared latent spaces. (arXiv:1811.02373v1 [cs.CV])

Authors: Vasily Morzhakov

Autoencoders obtain latent models of input data. Recent works have shown that they also estimate probability density functions of the input. This fact makes it possible to use Bayesian decision theory. If we obtain latent models of input data for each class, or for some points in the space of parameters in a parameter estimation task, we are able to estimate likelihood functions for those classes or points in parameter space. We show how the set of autoencoders solves the recognition problem. Each autoencoder describes its own model or context; a latent vector that represents the input data in the latent space may be called its treatment in that context. Sharing the latent spaces of the autoencoders yields a very important property: the ability to separate treatment and context when the input information is processed by the set of autoencoders. There are two remarkable and most valuable results of this work: a mechanism that shows a possible way of forming abstract concepts, and a way of reducing the dataset size during training. These results are confirmed by tests presented in the article.


          Robust Bhattacharyya bound linear discriminant analysis through adaptive algorithm. (arXiv:1811.02384v1 [cs.LG])

Authors: Chun-Na Li, Yuan-Hai Shao, Zhen Wang, Nai-Yang Deng

In this paper, we propose a novel linear discriminant analysis criterion via Bhattacharyya error bound estimation, based on a novel L1-norm (L1BLDA) and L2-norm (L2BLDA). Both L1BLDA and L2BLDA maximize the between-class scatters, which are measured by the weighted pairwise distances of class means, and meanwhile minimize the within-class scatters under the L1-norm and L2-norm, respectively. The proposed models avoid the small sample size (SSS) problem and have no rank limit, both of which may be encountered in LDA. It is worth mentioning that the use of the L1-norm makes L1BLDA robust, and that L1BLDA is solved through an effective non-greedy alternating direction method of multipliers (ADMM), where all the projection vectors can be obtained once and for all. In addition, the weighting constants of L1BLDA and L2BLDA between the between-class and within-class terms are determined by the data set involved, which makes L1BLDA and L2BLDA adaptive. The experimental results on both benchmark data sets and handwritten digit databases demonstrate the effectiveness of the proposed methods.


          Fine-grained Apparel Classification and Retrieval without rich annotations. (arXiv:1811.02385v1 [cs.CV])

Authors: Aniket Bhatnagar, Sanchit Aggarwal

The ability to correctly classify and retrieve apparel images has a variety of applications important to e-commerce, online advertising and internet search. In this work, we propose a robust framework for fine-grained apparel classification and for in-shop and cross-domain retrieval, which eliminates the need for rich annotations such as bounding boxes, human joints or clothing landmarks, and for training a bounding box or key-landmark detector. Factors such as subtle appearance differences, variations in human poses, different shooting angles, apparel deformations, and self-occlusion add to the challenges in classification and retrieval of apparel items. Cross-domain retrieval is even harder due to the large variation between online shopping images, usually taken with ideal lighting, pose, angle and a clean background, and street photos captured by users in complicated conditions with poor lighting and cluttered scenes. Our framework uses a compact bilinear CNN with the tensor sketch algorithm to generate embeddings that capture local pairwise feature interactions in a translationally invariant manner. For apparel classification, we pass the feature embeddings through a softmax classifier, while the in-shop and cross-domain retrieval pipelines use a triplet-loss based optimization approach, such that the squared Euclidean distance between embeddings measures the dissimilarity between the images. Unlike previous works that relied on bounding box, key clothing landmark or human joint detectors to assist the final deep classifier, the proposed framework can be trained directly on the provided category labels or on generated triplets for triplet loss optimization. Lastly, experimental results on the DeepFashion fine-grained categorization, in-shop retrieval and consumer-to-shop retrieval datasets provide a comparative analysis with previous work in the domain.
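
As an illustration of the triplet-loss idea mentioned in this abstract, here is a minimal sketch in plain Python. The margin value, the two-dimensional embeddings and the function names are assumptions made for the example; the abstract does not specify them, and this is not the authors' implementation.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (margin is an assumed hyperparameter).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = squared_euclidean(anchor, positive)
    d_neg = squared_euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # same garment as the anchor
negative = [3.0, 0.0]   # a different garment

loss_easy = triplet_loss(anchor, positive, negative)   # 1 - 9 + 1 < 0, so 0.0
loss_hard = triplet_loss(anchor, negative, positive)   # 9 - 1 + 1 = 9.0
```

Minimizing such a loss over many triplets pulls matching images together and pushes non-matching ones apart, so that the distance itself can serve as the retrieval score.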


          Local-Encoding-Preserving Secure Network Coding---Part I: Fixed Security Level. (arXiv:1811.02388v1 [cs.IT])

Authors: Xuan Guang, Raymond W. Yeung, Fang-Wei Fu

Information-theoretic security is considered in the paradigm of network coding in the presence of wiretappers, who can access one arbitrary edge subset up to a certain size, also referred to as the security level. Secure network coding is applied to prevent the leakage of the source information to the wiretappers. In this two-part paper, we consider the problem of secure network coding when the information rate and the security level can change over time. In the current paper (i.e., Part I of the two-part paper), we focus on the problem for a fixed security level and a flexible rate. To efficiently solve this problem, we put forward local-encoding-preserving secure network coding, where a family of secure linear network codes (SLNCs) is called local-encoding-preserving if all the SLNCs in this family share a common local encoding kernel at each intermediate node in the network. We present an efficient approach for constructing, from an existing SLNC, a local-encoding-preserving SLNC with the same security level and the rate reduced by one. By applying this approach repeatedly, we can obtain a family of local-encoding-preserving SLNCs with a fixed security level and multiple rates. We also develop a polynomial-time algorithm for efficient implementation of this approach. Furthermore, it is proved that the proposed approach incurs no penalty on the required field size for the existence of SLNCs in terms of the best known lower bound by Guang and Yeung. The result in this paper will be used as a building block for efficiently constructing a family of local-encoding-preserving SLNCs for all possible pairs of rate and security level, which will be discussed in the companion paper (i.e., Part II of the two-part paper).


          Local-Encoding-Preserving Secure Network Coding---Part II: Flexible Rate and Security Level. (arXiv:1811.02390v1 [cs.IT])

Authors: Xuan Guang, Raymond W. Yeung, Fang-Wei Fu

In this two-part paper, we consider the problem of secure network coding when the information rate and the security level can change over time. To efficiently solve this problem, we put forward local-encoding-preserving secure network coding, where a family of secure linear network codes (SLNCs) is called local-encoding-preserving (LEP) if all the SLNCs in this family share a common local encoding kernel at each intermediate node in the network. In this paper (Part II), we first consider the design of a family of LEP SLNCs for a fixed rate and a flexible security level. We present a novel and efficient approach for constructing, from an existing SLNC, an LEP SLNC with the same rate and the security level increased by one. Next, we consider the design of a family of LEP SLNCs for a fixed dimension (equal to the sum of rate and security level) and a flexible pair of rate and security level. We propose another novel approach for designing an SLNC such that the same SLNC can be applied to all the rate and security-level pairs with the fixed dimension. Also, two polynomial-time algorithms are developed for efficient implementations of our two approaches, respectively. Furthermore, we prove that neither approach incurs any penalty on the required field size for the existence of SLNCs in terms of the best known lower bound by Guang and Yeung. Finally, we consider the ultimate problem of designing a family of LEP SLNCs that can be applied to all possible pairs of rate and security level. By suitably combining the construction of a family of LEP SLNCs for a fixed security level and a flexible rate (obtained in Part I) with the constructions of the two families of LEP SLNCs in the current paper, we can obtain a family of LEP SLNCs that can be applied to all possible pairs of rate and security level. Three such constructions are presented.


          Towards digitalisation of summative and formative assessments in academic teaching of statistics. (arXiv:1811.02391v1 [cs.HC])

Authors: Nils Schwinning, Michael Striewe, Till Massing, Christoph Hanck, Michael Goedicke

Web-based systems for assessment or homework are commonly used in many different domains. Several studies show that these systems can have positive effects on learning outcomes. Many research efforts have also made these systems quite flexible with respect to different item formats and exercise styles. However, there is still a lack of support for complex exercises in several domains at university level. Although there are systems that allow for quite sophisticated operations for generating exercise contents, there is less support for using similar operations for evaluating students' input and for feedback generation. This paper elaborates on filling this gap in the specific case of statistics. We present both the conceptual requirements for this specific domain and a fully implemented solution. Furthermore, we report on using this solution for formative and summative assessments in lectures with large numbers of participants at a big university.


          DeepChannel: Salience Estimation by Contrastive Learning for Extractive Document Summarization. (arXiv:1811.02394v1 [cs.CL])

Authors: Jiaxin Shi, Chen Liang, Lei Hou, Juanzi Li, Zhiyuan Liu, Hanwang Zhang

We propose DeepChannel, a robust, data-efficient, and interpretable neural model for extractive document summarization. Given any document-summary pair, we estimate a salience score, which is modeled using an attention-based deep neural network, to represent the salience degree of the summary for yielding the document. We devise a contrastive training strategy to learn the salience estimation network, and then use the learned salience score as a guide to iteratively extract the most salient sentences from the document as our generated summary. In experiments, our model not only achieves state-of-the-art ROUGE scores on the CNN/Daily Mail dataset, but also shows strong robustness in the out-of-domain test on the DUC2007 test set. Moreover, our model reaches a ROUGE-1 F-1 score of 39.41 on the CNN/Daily Mail test set with merely $1/100$ of the training set, demonstrating tremendous data efficiency.
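
The iterative extraction step described in this abstract can be pictured as a greedy loop. The sketch below is only illustrative: `toy_salience` is a hypothetical word-overlap scorer standing in for the learned attention-based salience network, and the stopping rule and `max_sentences` parameter are assumptions.

```python
def toy_salience(summary_sentences, document_sentences):
    """Hypothetical stand-in scorer: fraction of document words covered."""
    doc_words = {w for s in document_sentences for w in s.lower().split()}
    if not doc_words:
        return 0.0
    sum_words = {w for s in summary_sentences for w in s.lower().split()}
    return len(doc_words & sum_words) / len(doc_words)


def greedy_extract(document_sentences, salience, max_sentences=2):
    """Repeatedly add the sentence that most increases the salience score."""
    summary = []
    remaining = list(document_sentences)
    while remaining and len(summary) < max_sentences:
        current = salience(summary, document_sentences)
        best = max(
            remaining,
            key=lambda s: salience(summary + [s], document_sentences),
        )
        if salience(summary + [best], document_sentences) <= current:
            break  # no remaining sentence improves the score
        summary.append(best)
        remaining.remove(best)
    return summary


doc = [
    "the cat sat on the mat",
    "the cat sat",
    "dogs bark loudly tonight",
]
summary = greedy_extract(doc, toy_salience, max_sentences=2)
```

With this toy scorer the redundant second sentence is skipped, because it adds no coverage once the first sentence has been selected.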


          A 'Little Bit' Too Much? High Speed Imaging from Sparse Photon Counts. (arXiv:1811.02396v1 [cs.CV])

Authors: Paramanand Chandramouli, Samuel Burri, Claudio Bruschini, Edoardo Charbon, Andreas Kolb

Recent advances in photographic sensing technologies have made it possible to achieve light detection in terms of a single photon. Photon counting sensors are being increasingly used in many diverse applications. We address the problem of jointly recovering spatial and temporal scene radiance from very few photon counts. Our ConvNet-based scheme effectively combines spatial and temporal information present in measurements to reduce noise. We demonstrate that using our method one can acquire videos at a high frame rate and still achieve good quality signal-to-noise ratio. Experiments show that the proposed scheme performs quite well in different challenging scenarios while the existing denoising schemes are unable to handle them.


          First Order Alternation. (arXiv:1811.02398v1 [cs.FL])

Authors: Radu Iosif, Xiao Xu

We introduce first order alternating automata, a generalization of boolean alternating automata, in which transition rules are described by multisorted first order formulae, with states and internal variables given by uninterpreted predicate terms. The model is closed under union, intersection and complement, and its emptiness problem is undecidable, even for the simplest data theory of equality. To cope with this limitation, we develop an abstraction refinement semi-algorithm based on lazy annotation of the symbolic execution paths with interpolants, obtained by applying (i) quantifier elimination with witness term generation and (ii) Lyndon interpolation in the quantifier-free data theory with uninterpreted predicate symbols. This provides a method for checking inclusion of timed and finite-memory register automata, and emptiness of quantified predicate automata, previously used in the verification of parameterized concurrent programs, composed of replicated threads, with a shared-memory communication model.


          Architecture of Distributed Data Storage for Astroparticle Physics. (arXiv:1811.02403v1 [cs.DC])

Authors: Alexander Kryukov, Andrey Demichev

For the successful development of astrophysics and, accordingly, for obtaining more complete knowledge of the Universe, it is extremely important to combine and comprehensively analyze information of various types (e.g., about charged cosmic particles, gamma rays, neutrinos, etc.) obtained by using diverse large-scale experimental setups located throughout the world. Activities of all kinds must be performed continually across all stages of the data life cycle to support effective data management, in particular the collection and storage of data, its processing and analysis, refining the physical model, making preparations for publication, and data reprocessing that takes the refinement into account. In this paper we present a general approach to the construction, and the architecture, of a system able to collect, store, and provide users with access to astrophysical data. We also suggest a new approach to the construction of a metadata registry based on blockchain technology.


          User Specific Adaptation in Automatic Transcription of Vocalised Percussion. (arXiv:1811.02406v1 [cs.SD])

Authors: António Ramires, Rui Penha, Matthew E. P. Davies

The goal of this work is to develop an application that enables music producers to use their voice to create drum patterns when composing in Digital Audio Workstations (DAWs). An easy-to-use and user-oriented system capable of automatically transcribing vocalisations of percussion sounds, called LVT (Live Vocalised Transcription), is presented. LVT is developed as a Max for Live device which follows the 'segment-and-classify' methodology for drum transcription, and includes three modules: i) an onset detector to segment events in time; ii) a module that extracts relevant features from the audio content; and iii) a machine-learning component that implements the k-Nearest Neighbours (kNN) algorithm for the classification of vocalised drum timbres.

Due to the wide differences in vocalisations from distinct users for the same drum sound, a user-specific approach to vocalised transcription is proposed. In this perspective, a given end-user trains the algorithm with their own vocalisations for each drum sound before inputting their desired pattern into the DAW. The user adaption is achieved via a new Max external which implements Sequential Forward Selection (SFS) for choosing the most relevant features for a given set of input drum sounds.
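
To make the combination of kNN classification and Sequential Forward Selection concrete, here is a small sketch in plain Python. It is not the Max external described in the abstract: the selection criterion (leave-one-out accuracy), the feature values and the drum labels are all assumptions chosen for illustration.

```python
def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training points nearest to `query`."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = [train_y[i] for i in order[:k]]
    # sorted() makes tie-breaking deterministic
    return max(sorted(set(votes)), key=votes.count)


def loo_accuracy(x, y, features, k=1):
    """Leave-one-out kNN accuracy using only the given feature indices."""
    correct = 0
    for i in range(len(x)):
        train_x = [[row[f] for f in features] for j, row in enumerate(x) if j != i]
        train_y = [lab for j, lab in enumerate(y) if j != i]
        query = [x[i][f] for f in features]
        correct += knn_predict(train_x, train_y, query, k) == y[i]
    return correct / len(x)


def sequential_forward_selection(x, y, n_features, k=1):
    """Greedily add the feature that gives the best leave-one-out accuracy."""
    selected = []
    candidates = list(range(len(x[0])))
    while candidates and len(selected) < n_features:
        best = max(candidates, key=lambda f: loo_accuracy(x, y, selected + [f], k))
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical user recordings: feature 0 is uninformative, feature 1 separates the classes.
x = [[0.9, 0.1], [0.1, 0.2], [0.5, 0.15], [0.2, 0.9], [0.8, 1.0], [0.4, 0.95]]
y = ["kick", "kick", "kick", "snare", "snare", "snare"]
selected = sequential_forward_selection(x, y, n_features=1)
```

Because the selection is run on one user's own training vocalisations, different users can end up with different feature subsets, which is the point of the user-specific approach.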


          An audio-only method for advertisement detection in broadcast television content. (arXiv:1811.02411v1 [cs.SD])

Authors: António Ramires, Diogo Cocharro, Matthew E. P. Davies

We address the task of advertisement detection in broadcast television content. While the task is typically approached from a video-only or audio-visual perspective, we present an audio-only method. Our approach centres on the detection of short silences which exist at the boundaries between programming and advertising, as well as between the advertisements themselves. To identify advertising regions, we first locate all points within the broadcast content with very low signal energy. Next, we use a multiple linear regression model to reject non-boundary silences based on features extracted from the local context immediately surrounding the silence. Finally, we determine the advertising regions based on the long-term grouping of detected boundary silences. When evaluated over a 26-hour annotated database covering national and commercial Portuguese television channels, we obtain a Matthews correlation coefficient in excess of 0.87 and outperform a freely available audio-visual approach.
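
The first step of this pipeline (locating very low-energy points) and the reported evaluation metric can be sketched as follows. The frame length, threshold and sample values are assumptions for illustration; the regression-based rejection and the long-term grouping steps are not shown.

```python
import math


def low_energy_frames(samples, frame_len, threshold):
    """Indices of non-overlapping frames whose mean squared amplitude is below `threshold`."""
    quiet = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy < threshold:
            quiet.append(start // frame_len)
    return quiet


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical signal: loud, near-silent, loud.
samples = [0.5, -0.5, 0.5, -0.5,
           0.0, 0.001, 0.0, -0.001,
           0.4, -0.4, 0.4, -0.4]
quiet = low_energy_frames(samples, frame_len=4, threshold=1e-4)
```

The coefficient ranges from -1 (total disagreement) through 0 (chance level) to 1 (perfect agreement), which makes it a reasonable summary when advertising and programming segments are unevenly distributed.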


          Low-Rank Tensor Modeling for Hyperspectral Unmixing Accounting for Spectral Variability. (arXiv:1811.02413v1 [cs.CV])

Authors: Tales Imbiriba, Ricardo Augusto Borsoi, José Carlos Moreira Bermudez

Traditional hyperspectral unmixing methods neglect the underlying variability of spectral signatures often observed in typical hyperspectral images, propagating these mismodeling errors throughout the whole unmixing process. Attempts to model material spectra as members of sets or as random variables tend to lead to severely ill-posed unmixing problems. To overcome this drawback, endmember variability has been handled through generalizations of the mixing model, combined with spatial regularization over the abundance and endmember estimations. Recently, tensor-based strategies considered low-rank decompositions of hyperspectral images as an alternative way to impose low-dimensional structures on the solutions of standard and multitemporal unmixing problems. These strategies, however, present two main drawbacks: 1) they confine the solutions to low-rank tensors, which often cannot represent the complexity of real-world scenarios; and 2) they lack guarantees that endmembers and abundances will be correctly factorized in their respective tensors. In this work, we propose a more flexible approach, called ULTRA-V, that imposes low-rank structures through regularizations whose strictness is controlled by scalar parameters. Simulations attest to the superior accuracy of the method when compared with state-of-the-art unmixing algorithms that account for spectral variability.


          FPT-algorithms for computing Gromov-Hausdorff and interleaving distances between trees. (arXiv:1811.02425v1 [cs.CG])

Authors: Elena Farahbakhsh Touli, Yusu Wang

Gromov-Hausdorff (GH) distance is a natural way to measure the distortion between two metric spaces. However, there has been only limited algorithmic development to compute or approximate this distance. We focus on computing the Gromov-Hausdorff distance between two metric trees. Roughly speaking, a metric tree is a metric space that can be realized by the shortest path metric on a tree. Previously, Agarwal et al. showed that even for trees with unit edge length, it is NP-hard to approximate the GH distance between them within a factor of 3. In this paper, we present a fixed-parameter tractable (FPT) algorithm that can approximate the GH distance between two general metric trees within a factor of 14.

Interestingly, the development of our algorithm is made possible by a connection between the GH distance for metric trees and the interleaving distance for the so-called merge trees. Merge trees arise naturally in practice as a simple yet meaningful topological summary, and are of independent interest. It turns out that an exact or approximation algorithm for the interleaving distance leads to an approximation algorithm for the Gromov-Hausdorff distance. One of the key contributions of our work is that we redefine the interleaving distance in a way that makes it easier to develop dynamic programming approaches to compute it. We then present an FPT algorithm to compute the interleaving distance between two merge trees exactly, which ultimately leads to an FPT algorithm to approximate the GH distance between two metric trees. This exact FPT algorithm to compute the interleaving distance between merge trees is of interest in itself, as it is known to be NP-hard to approximate this distance within a factor of 3, and previously the best known algorithm had an approximation factor of $O(\sqrt{n})$ even for trees with unit edge length.


          Automatic Repair of Real Bugs in Java: A Large-Scale Experiment on the Defects4J Dataset. (arXiv:1811.02429v1 [cs.SE])

Authors: Matias Martinez, Thomas Durieux, Romain Sommerard, Jifeng Xuan, Martin Monperrus

Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs. Each bug in Defects4J comes with a test suite and at least one failing test case that triggers the bug. In this paper, we report on an experiment to explore the effectiveness of automatic test-suite based repair on Defects4J. The result of our experiment shows that the considered state-of-the-art repair methods can generate patches for 47 out of 224 bugs. However, those patches are only test-suite adequate, which means that they pass the test suite but may still be incorrect beyond the test-suite satisfaction correctness criterion. We have manually analyzed 84 different patches to assess their real correctness. In total, 9 real Java bugs can be correctly repaired with test-suite based repair. This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial or incorrect patches still pass the test suite. With respect to practical applicability, it takes on average 14.8 minutes to find a patch. The experiment was done on a scientific grid, totaling 17.6 days of computation time. All the repair systems and experimental results are publicly available on GitHub in order to facilitate future research on automatic repair.


          WordNet-feelings: A linguistic categorisation of human feelings. (arXiv:1811.02435v1 [cs.CL])

Authors: Advaith Siddharthan, Nicolas Cherbuin, Paul J. Eslinger, Kasia Kozlowska, Nora A. Murphy, Leroy Lowe

In this article, we present the first in-depth linguistic study of human feelings. While there has been substantial research on incorporating some affective categories into linguistic analysis (e.g. sentiment, and to a lesser extent, emotion), the more diverse category of human feelings has thus far not been investigated. We surveyed the extensive interdisciplinary literature around feelings to construct a working definition of what constitutes a feeling and propose 9 broad categories of feeling. We identified potential feeling words based on their pointwise mutual information with morphological variants of the word 'feel' in the Google n-gram corpus, and present a manual annotation exercise where 317 WordNet senses of one hundred of these words were categorised as 'not a feeling' or as one of the 9 proposed categories of feeling. We then proceeded to annotate 11386 WordNet senses of all these words to create WordNet-feelings, a new affective dataset that identifies 3664 word senses as feelings, and associates each of these with one of the 9 categories of feeling. WordNet-feelings can be used in conjunction with other datasets such as SentiWordNet that annotate word senses with complementary affective properties such as valence and intensity.
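
For readers unfamiliar with pointwise mutual information (PMI), the measure used here to find candidate feeling words, it compares how often two words co-occur with how often they would co-occur if independent: PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). The sketch below computes it from raw counts; the counts and the base-2 logarithm are assumptions for illustration, not figures from the Google n-gram corpus.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (in bits) from co-occurrence counts.

    count_xy: number of contexts containing both x and y
    count_x, count_y: number of contexts containing x, respectively y
    total: total number of contexts
    """
    return math.log2((count_xy * total) / (count_x * count_y))


# Hypothetical counts: 'feel' in 16 contexts, a candidate word in 32,
# both together in 8, out of 1024 contexts overall.
associated = pmi(count_xy=8, count_x=16, count_y=32, total=1024)    # log2(16) = 4.0
independent = pmi(count_xy=1, count_x=32, count_y=32, total=1024)   # log2(1) = 0.0
```

A clearly positive value suggests that the candidate word appears with 'feel' more often than chance would predict, while a value near zero suggests no particular association.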


          Trainable Adaptive Window Switching for Speech Enhancement. (arXiv:1811.02438v1 [eess.AS])

Authors: Yuma Koizumi, Noboru Harada, Yoichi Haneda

This study proposes a trainable adaptive window switching (AWS) method and applies it to a deep neural network (DNN) for speech enhancement in the modified discrete cosine transform domain. Time-frequency (T-F) mask processing in the short-time Fourier transform (STFT) domain is a typical speech enhancement method. To recover the target signal precisely, DNN-based short-time frequency transforms have recently been investigated and used instead of the STFT. However, since such a fixed-resolution short-time frequency transform method has a T-F resolution problem based on the uncertainty principle, not only the short-time frequency transform but also the length of the windowing function should be optimized. To overcome this problem, we incorporate AWS into the speech enhancement procedure, and the windowing function of each time frame is manipulated using a DNN depending on the input signal. We confirmed that the proposed method achieved a higher signal-to-distortion ratio than conventional speech enhancement methods in fixed-resolution frequency domains.


          Gradual Type Theory (Extended Version). (arXiv:1811.02440v1 [cs.PL])

Authors: Max S. New, Daniel R. Licata, Amal Ahmed

Gradually typed languages are designed to support both dynamically typed and statically typed programming styles while preserving the benefits of each. While existing gradual type soundness theorems for these languages aim to show that type-based reasoning is preserved when moving from the fully static setting to a gradual one, these theorems do not imply that correctness of type-based refactorings and optimizations is preserved. Establishing correctness of program transformations is technically difficult, and is often neglected in the metatheory of gradual languages.

In this paper, we propose an axiomatic account of program equivalence in a gradual cast calculus, which we formalize in a logic we call gradual type theory (GTT). Based on Levy's call-by-push-value, GTT gives an axiomatic account of both call-by-value and call-by-name gradual languages. We then prove theorems that justify optimizations and refactorings in gradually typed languages. For example, uniqueness principles for gradual type connectives show that if the $\beta\eta$ laws hold for a connective, then casts between that connective must be equivalent to the lazy cast semantics. Contrapositively, this shows that eager cast semantics violates the extensionality of function types. As another example, we show that gradual upcasts are pure and dually, gradual downcasts are strict. We show the consistency and applicability of our theory by proving that an implementation using the lazy cast semantics gives a logical relations model of our type theory, where equivalence in GTT implies contextual equivalence of the programs. Since GTT also axiomatizes the dynamic gradual guarantee, our model also establishes this central theorem of gradual typing. The model is parametrized by the implementation of the dynamic types, and so gives a family of implementations that validate type-based optimization and the gradual guarantee.


          Meta Distribution of Downlink Non-Orthogonal Multiple Access (NOMA) in Poisson Networks. (arXiv:1811.02443v1 [cs.IT])

Authors: Konpal Shaukat Ali, Hesham ElSawy, Mohamed-Slim Alouini

We study the meta distribution (MD) of the coverage probability (CP) in downlink non-orthogonal-multiple-access (NOMA) networks. Two schemes are assessed based on the location of the NOMA users: 1) anywhere in the network, 2) cell-center users only. The moments of the MD for both schemes are derived and the MD is approximated via the beta distribution. Closed-form moments are derived for the first scheme; for the second scheme, exact moments are derived along with approximate moments that simplify the integral calculation. We show that restricting NOMA to cell-center users provides significantly higher mean, lower variance and better percentile performance for the CP.
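
The beta approximation mentioned above is commonly obtained by matching the first two moments of the MD. The following is a minimal sketch of that moment matching, not the authors' derivation; the function name and the example moment values are assumptions made for illustration.

```python
def beta_from_moments(m1, m2):
    """Return (a, b) of the Beta(a, b) distribution whose first and
    second raw moments equal m1 and m2 (standard method of moments)."""
    variance = m2 - m1 * m1
    # A non-degenerate distribution on [0, 1] needs 0 < m1 < 1 and
    # m1^2 < m2 < m1.
    if not (0.0 < m1 < 1.0) or variance <= 0.0 or m2 >= m1:
        raise ValueError("moments do not describe a non-degenerate distribution on [0, 1]")
    b = (m1 - m2) * (1.0 - m1) / variance
    a = m1 * b / (1.0 - m1)
    return a, b


# Hypothetical moments, chosen so that the matched distribution is Beta(2, 3).
a, b = beta_from_moments(0.4, 0.2)
```

With the shape parameters in hand, percentile-style quantities (the fraction of links whose CP exceeds a threshold) follow from the beta CDF.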


          Blameworthiness in Games with Imperfect Information. (arXiv:1811.02446v1 [cs.AI])

Authors: Pavel Naumov, Jia Tao

Blameworthiness of an agent or a coalition of agents is often defined in terms of the principle of alternative possibilities: for the coalition to be responsible for an outcome, the outcome must take place and the coalition should have had a strategy to prevent it. In this paper we argue that in the settings with imperfect information, not only should the coalition have had a strategy, but it also should have known that it had a strategy, and it should have known what the strategy was.

The main technical result of the paper is a sound and complete bimodal logic that describes the interplay between knowledge and blameworthiness in strategic games with imperfect information.


          Multi-Level Sensor Fusion with Deep Learning. (arXiv:1811.02447v1 [cs.CV])

Authors: Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie

In the context of deep learning, this article presents an original deep network, namely CentralNet, for the fusion of information coming from different sensors. This approach is designed to efficiently and automatically balance the trade-off between early and late fusion (i.e. between the fusion of low-level vs high-level information). More specifically, at each level of abstraction (the different levels of deep networks), uni-modal representations of the data are fed to a central neural network which combines them into a common embedding. In addition, a multi-objective regularization is introduced, helping to optimize both the central network and the unimodal networks. Experiments on four multimodal datasets not only show state-of-the-art performance, but also demonstrate that CentralNet can actually choose the best possible fusion strategy for a given problem.


          Synaptic Strength For Convolutional Neural Network. (arXiv:1811.02454v1 [cs.LG])

Authors: Chen Lin, Zhao Zhong, Wei Wu, Junjie Yan

Convolutional Neural Networks (CNNs) are both computation and memory intensive, which has hindered their deployment on mobile devices. Inspired by the relevant concept in the neuroscience literature, we propose Synaptic Pruning: a data-driven method to prune connections between input and output feature maps with a newly proposed class of parameters called Synaptic Strength. Synaptic Strength is designed to capture the importance of a connection based on the amount of information it transports. Experimental results show the effectiveness of our approach. On CIFAR-10, we prune up to 96% of the connections for various CNN models, which results in significant size reduction and computation saving. Further evaluation on ImageNet demonstrates that synaptic pruning is able to discover efficient models that are competitive with state-of-the-art compact CNNs such as MobileNet-V2 and NasNet-Mobile. Our contributions are summarized as follows: (1) We introduce Synaptic Strength, a new class of parameters for CNNs that indicates the importance of each connection. (2) Our approach can prune various CNNs with high compression without compromising accuracy. (3) Further investigation shows that the proposed Synaptic Strength is a better indicator for kernel pruning than the previous approach in both empirical results and theoretical analysis.


          On the Number of Order Types in Integer Grids of Small Size. (arXiv:1811.02455v1 [cs.CG])

Authors: Luis E. Caraballo, José-Miguel Díaz-Báñez, Ruy Fabila-Monroy, Carlos Hidalgo-Toscano, Jesús Leaños, Amanda Montejano

Let $\{p_1,\dots,p_n\}$ and $\{q_1,\dots,q_n\}$ be two sets of $n$ labeled points in general position in the plane. We say that these two point sets have the same order type if for every triple of indices $(i,j,k)$, $p_k$ is above the directed line from $p_i$ to $p_j$ if and only if $q_k$ is above the directed line from $q_i$ to $q_j$. In this paper we give the first non-trivial lower bounds on the number of different order types of $n$ points that can be realized in integer grids of polynomial
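
The order-type definition above can be checked directly with an orientation test on every triple of indices. The sketch below is illustrative only: it uses the usual cross-product sign ("left of the directed line") as the orientation predicate, which is an assumption standing in for the abstract's "above the directed line", and the function names are hypothetical.

```python
from itertools import permutations


def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p):
    +1 if r lies to the left of the directed line p -> q,
    -1 if it lies to the right, 0 if the three points are collinear."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)


def same_order_type(points_p, points_q):
    """True if two equally long labeled point sets agree on the
    orientation of every ordered triple of distinct indices."""
    if len(points_p) != len(points_q):
        return False
    indices = range(len(points_p))
    return all(
        orientation(points_p[i], points_p[j], points_p[k])
        == orientation(points_q[i], points_q[j], points_q[k])
        for i, j, k in permutations(indices, 3)
    )


# Integer coordinates keep the predicate exact.
triangle = [(0, 0), (1, 0), (0, 1)]
stretched = [(0, 0), (2, 0), (1, 3)]
mirrored = [(0, 0), (1, 0), (0, -1)]
```

Here `triangle` and `stretched` share an order type, while `mirrored` reverses every orientation and so does not.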


          Semantic Term "Blurring" and Stochastic "Barcoding" for Improved Unsupervised Text Classification. (arXiv:1811.02456v1 [cs.CL])

Authors: Robert Frank Martorano III

The abundance of text data being produced in the modern age makes it increasingly important to intuitively group, categorize, or classify text data by theme for efficient retrieval and search. Yet, the high dimensionality and imprecision of text data, or more generally language as a whole, prove to be challenging when attempting to perform unsupervised document clustering. In this thesis, we present two novel methods for improving unsupervised document clustering/classification by theme. The first is to improve document representations. We look to exploit "term neighborhoods" and "blur" semantic weight across neighboring terms. These neighborhoods are located in the semantic space afforded by "word embeddings." The second method is for cluster revision, based on what we deem as "stochastic barcoding", or "S-Barcode" patterns. Text data is inherently high dimensional, yet clustering typically takes place in a low dimensional representation space. Our method utilizes lower dimension clustering results as initial cluster configurations, and iteratively revises the configuration in the high dimensional space. We show with experimental results how both of the two methods improve the quality of document clustering. While this thesis elaborates on the two new conceptual contributions, a joint thesis by David Yan details the feature transformation and software architecture we developed for unsupervised document classification.


          Tunneling on Wheeler Graphs. (arXiv:1811.02457v1 [cs.DS])

Authors: Jarno Alanko, Travis Gagie, Gonzalo Navarro, Louisa Seelbach Benkner

The Burrows-Wheeler Transform (BWT) is an important technique both in data compression and in the design of compact indexing data structures. It has been generalized from single strings to collections of strings and some classes of labeled directed graphs, such as tries and de Bruijn graphs. The BWTs of repetitive datasets are often compressible using run-length compression, but recently Baier (CPM 2018) described how they could be even further compressed using an idea he called tunneling. In this paper we show that tunneled BWTs can still be used for indexing and extend tunneling to the BWTs of Wheeler graphs, a framework that includes all the generalizations mentioned above.
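
As background for the compression step mentioned above, here is a minimal sketch of the plain BWT of a single string followed by run-length encoding. It is the textbook sorted-rotations construction (quadratic space, fine for short inputs), not tunneling and not the Wheeler-graph generalization; the function names and the `$` sentinel are illustrative choices.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations.
    The sentinel must not occur in the text and must sort before
    every character of it."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s):
    """Collapse maximal runs of equal characters into (char, length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1
        else:
            runs.append([ch, 1])
    return [(ch, count) for ch, count in runs]


transformed = bwt("banana")          # "annb$aa"
runs = run_length_encode(transformed)
```

Repetitive inputs tend to produce long runs in the transformed string, which is what makes run-length compression effective on them.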


          A Novel Variational Family for Hidden Nonlinear Markov Models. (arXiv:1811.02459v1 [stat.ML])

Authors: Daniel Hernandez, Antonio Khalil Moretti, Ziqiang Wei, Shreya Saxena, John Cunningham, Liam Paninski

Latent variable models have been widely applied for the analysis and visualization of large datasets. In the case of sequential data, closed-form inference is possible when the transition and observation functions are linear. However, approximate inference techniques are usually necessary when dealing with nonlinear dynamics and observation functions. Here, we propose a novel variational inference framework for the explicit modeling of time series, Variational Inference for Nonlinear Dynamics (VIND), that is able to uncover nonlinear observation and transition functions from sequential data. The framework includes a structured approximate posterior, and an algorithm that relies on the fixed-point iteration method to find the best estimate for latent trajectories. We apply the method to several datasets and show that it is able to accurately infer the underlying dynamics of these systems, in some cases substantially outperforming state-of-the-art methods.


          Constraint-Driven Coordinated Control of Multi-Robot Systems. (arXiv:1811.02465v1 [cs.RO])

Authors: Gennaro Notomista, Magnus Egerstedt

In this paper we present a reformulation--framed as a constrained optimization problem--of multi-robot tasks which are encoded through a cost function that is to be minimized. The advantages of this approach are multiple. The constraint-based formulation provides a natural way of enabling long-term robot autonomy applications, where resilience and adaptability to changing environmental conditions are essential. Moreover, under certain assumptions on the cost function, the resulting controller is guaranteed to be decentralized. Furthermore, finite-time convergence can be achieved, while using local information only, and therefore preserving the decentralized nature of the algorithm. The developed control framework has been tested on a team of ground mobile robots implementing long-term environmental monitoring.


          Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery. (arXiv:1811.02471v1 [cs.CV])

Authors: Marc Rußwurm, Marco Körner

Dynamic spatiotemporal processes on the Earth can be observed by an increasing number of optical Earth observation satellites that measure spectral reflectance at multiple spectral bands in regular intervals. Clouds partially covering the surface is an omnipresent challenge for the majority of remote sensing approaches that are not robust regarding cloud coverage. In these approaches, clouds are typically handled by cherry-picking cloud-free observations or by pre-classification of cloudy pixels and subsequent masking. In this work, we demonstrate the robustness of a straightforward convolutional long short-term memory network for vegetation classification using all available cloudy and non-cloudy satellite observations. We visualize the internal gate activations within the recurrent cells and find that, in some cells, modulation and input gates close on cloudy pixels. This indicates that the network has internalized a cloud-filtering mechanism without being specifically trained on cloud labels. The robustness regarding clouds is further demonstrated by experiments on sequences with varying degrees of cloud coverage where our network achieved similar accuracies on all cloudy and non-cloudy datasets. Overall, our results question the necessity of sophisticated pre-processing pipelines if robust classification methods are utilized.


          User equilibrium with a policy-based link transmission model for stochastic time-dependent traffic networks. (arXiv:1811.02474v1 [cs.SY])

Authors: Hemant Gehlot, Satish V. Ukkusuri

Non-recurrent congestion is a major problem in traffic networks that causes unexpected delays during travel. In such a scenario, it is preferable to use adaptive paths or policies where next-link decisions on reaching junctions are continuously adapted based on the information gained with time. In this paper, we study a traffic assignment problem in stochastic time-dependent networks. The problem is modeled as a fixed-point problem and existence of the equilibrium solution is discussed. We iteratively solve the problem using the method of successive averages (MSA). A novel network loading model inspired by the link transmission model is developed that accepts policies as inputs for solving the problem. This network loading model is different from the existing network loading models that take predefined paths for input flows. We demonstrate through numerical tests that solving the traffic assignment problem with the proposed loading modeling scheme is more efficient than solving the problem using path-based network loading models.
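
To illustrate the MSA iteration on a fixed-point formulation, the sketch below solves a deliberately tiny static example: two parallel routes with linear travel-time functions. Everything in it (the cost functions, the demand, the function names) is an assumption for illustration; it does not model policies, stochastic link states, or the link transmission loading model described in the abstract.

```python
def travel_times(flows):
    """Hypothetical linear travel-time functions for two parallel routes."""
    x1, x2 = flows
    return 10.0 + x1, 15.0 + 0.5 * x2


def all_or_nothing(flows, demand):
    """Auxiliary assignment: send all demand to the currently faster route."""
    t1, t2 = travel_times(flows)
    return (demand, 0.0) if t1 <= t2 else (0.0, demand)


def msa(demand, iterations):
    """Method of successive averages: x <- x + (y - x) / (k + 1),
    where y is the all-or-nothing response to the current flows x."""
    flows = (demand, 0.0)
    for k in range(1, iterations + 1):
        target = all_or_nothing(flows, demand)
        step = 1.0 / (k + 1)
        flows = tuple(x + step * (y - x) for x, y in zip(flows, target))
    return flows


flows = msa(demand=10.0, iterations=2000)
```

For these cost functions the equilibrium equalizes both travel times at a route-1 flow of 20/3, and the iterates stay within roughly one step size of that value once they reach it.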


          Evolvement Constrained Adversarial Learning for Video Style Transfer. (arXiv:1811.02476v1 [cs.CV])

Authors: Wenbo Li, Longyin Wen, Xiao Bian, Siwei Lyu

Video style transfer is a useful component for applications such as augmented reality, non-photorealistic rendering, and interactive games. Many existing methods use optical flow to preserve the temporal smoothness of the synthesized video. However, the estimation of optical flow is sensitive to occlusions and rapid motions. Thus, in this work, we introduce a novel evolve-sync loss computed by evolvements to replace optical flow. Using this evolve-sync loss, we build an adversarial learning framework, termed as Video Style Transfer Generative Adversarial Network (VST-GAN), which improves upon the MGAN method for image style transfer for more efficient video style transfer. We perform extensive experimental evaluations of our method and show quantitative and qualitative improvements over the state-of-the-art methods.


          Proofs of life: molecular-biology reasoning simulates cell behaviors from first principles. (arXiv:1811.02478v1 [q-bio.OT])

Authors: René Vestergaard, Emmanuel Pietriga

Science relies on external correctness: statistical analysis and reproducibility, with ready applicability but inherent false positives/negatives. Mathematics uses internal correctness: conclusions must be established by detailed reasoning, with high confidence and deep insights but not necessarily real-world significance. Here, we formalize the molecular-biology reasoning style; establish that it constitutes an executable first-principle theory of cell behaviors that admits predictive technologies, with a range of correctness guarantees; and show that we can fully account for the standard reference: Ptashne, A Genetic Switch. Everything works for principled reasons and is presented within an open-ended meta-theoretic framework that seemingly applies to any reductionist discipline. The framework is adapted from a century-long line of work on mathematical reasoning. The key step is to not admit reasoning based on an external notion of truth but work only with what can be justified from considered assumptions. For molecular biology, the induced theory involves the concurrent running/interference of molecule-coded elementary processes of physiology change over the genome. The life cycle of the single-celled monograph organism is predicted in molecular detail as the aggregate of the possible sequentializations of the coded-for processes. The difficult question of molecular coding, i.e., the specific means of gene regulation, is addressed via a detailed modeling methodology. We establish a complementary perspective on science, complete with a proven correctness notion, and use it to make progress on long-standing and critical open problems in biology.


          Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments. (arXiv:1811.02480v1 [cs.CL])

Authors: Giovanni Morrone, Luca Pasa, Vadim Tikhanoff, Sonia Bergamaschi, Luciano Fadiga, Leonardo Badino

In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available. Contrary to most previous studies, we do not learn visual features on the typically small audio-visual datasets, but use an already available face landmark detector (trained on a separate image dataset). The landmarks are used by LSTM-based models to generate time-frequency masks which are applied to the acoustic mixed-speech spectrogram. Results show that: (i) landmark motion features are very effective features for this task, (ii) similarly to previous work, reconstruction of the target speaker's spectrogram mediated by masking is significantly more accurate than direct spectrogram reconstruction, and (iii) the best masks depend on both motion landmark features and the input mixed-speech spectrogram. To the best of our knowledge, our proposed models are the first models trained and evaluated on the limited size GRID and TCD-TIMIT datasets, that achieve speaker-independent speech enhancement in a multi-talker setting.


          Deep Reinforcement Learning for Green Security Games with Real-Time Information. (arXiv:1811.02483v1 [cs.MA])

Authors: Yufei Wang, Zheyuan Ryan Shi, Lantao Yu, Yi Wu, Rohit Singh, Lucas Joppa, Fei Fang

Green Security Games (GSGs) have been proposed and applied to optimize patrols conducted by law enforcement agencies in green security domains such as combating poaching, illegal logging and overfishing. However, real-time information such as footprints and agents' subsequent actions upon receiving the information, e.g., rangers following the footprints to chase the poacher, have been neglected in previous work. To fill the gap, we first propose a new game model GSG-I which augments GSGs with sequential movement and the vital element of real-time information. Second, we design a novel deep reinforcement learning-based algorithm, DeDOL, to compute a patrolling strategy that adapts to the real-time information against a best-responding attacker. DeDOL is built upon the double oracle framework and the policy-space response oracle, solving a restricted game and iteratively adding best response strategies to it through training deep Q-networks. Exploring the game structure, DeDOL uses domain-specific heuristic strategies as initial strategies and constructs several local modes for efficient and parallelized training. To our knowledge, this is the first attempt to use Deep Q-Learning for security games.


          Concept Learning with Energy-Based Models. (arXiv:1811.02486v1 [cs.AI])

Authors: Igor Mordatch

Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language, require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning. We present a framework that defines a concept by an energy function over events in the environment, as well as an attention mask over entities participating in the event. Given a few demonstration events, our method uses an inference-time optimization procedure to generate events involving similar concepts or identify entities involved in the concept. We evaluate our framework on learning visual, quantitative, relational, and temporal concepts from demonstration events in an unsupervised manner. Our approach is able to successfully generate and identify concepts in a few-shot setting, and the resulting learned concepts can be reused across environments. Example videos of our results are available at sites.google.com/site/energyconceptmodels


          Unifying Probabilistic Models for Time-Frequency Analysis. (arXiv:1811.02489v1 [eess.SP])

Authors: William J. Wilkinson, Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, Arno Solin

In audio signal processing, probabilistic time-frequency models have many benefits over their non-probabilistic counterparts. They adapt to the incoming signal, quantify uncertainty, and measure correlation between the signal's amplitude and phase information, making time domain resynthesis straightforward. However, these models are still not widely used since they come at a high computational cost, and because they are formulated in such a way that it can be difficult to interpret all the modelling assumptions. By showing their equivalence to Spectral Mixture Gaussian processes, we illuminate the underlying model assumptions and provide a general framework for constructing more complex models that better approximate real-world signals. Our interpretation makes it intuitive to inspect, compare, and alter the models since all prior knowledge is encoded in the Gaussian process kernel functions. We utilise a state space representation to perform efficient inference via Kalman smoothing, and we demonstrate how our interpretation allows for efficient parameter learning in the frequency domain.


          Mobile Data Science: Towards Understanding Data-Driven Intelligent Mobile Applications. (arXiv:1811.02491v1 [cs.CY])

Authors: Iqbal H. Sarker

Due to the popularity of smart mobile phones and context-aware technology, various contextual data relevant to users' diverse activities with mobile phones is available around us. This enables the study of mobile phone data and context-awareness in computing, for the purpose of building data-driven intelligent mobile applications, not only on a single device but also in a distributed environment for the benefit of end users. Based on the availability of mobile phone data and the usefulness of data-driven applications, in this paper we discuss mobile data science, which involves collecting mobile phone data from various sources and building data-driven models using machine learning techniques in order to make dynamic decisions intelligently in various day-to-day situations of the users. For this, we first discuss the fundamental concepts and the potential of mobile data science to build intelligent applications. We also highlight the key elements and explain the key modules involved in the process of mobile data science. This article is the first in the field to draw a big picture of mobile data science and its potential in developing various data-driven intelligent mobile applications. We believe this study will help both researchers and application developers build smart data-driven mobile applications that assist end mobile phone users in their daily activities.


          Towards continual learning in medical imaging. (arXiv:1811.02496v1 [cs.CV])

Authors: Chaitanya Baweja, Ben Glocker, Konstantinos Kamnitsas

This work investigates continual learning of two segmentation tasks in brain MRI with neural networks. To explore in this context the capabilities of current methods for countering catastrophic forgetting of the first task when a new one is learned, we investigate elastic weight consolidation, a recently proposed method based on Fisher information, originally evaluated on reinforcement learning of Atari games. We use it to sequentially learn segmentation of normal brain structures and then segmentation of white matter lesions. Our findings show this recent method reduces catastrophic forgetting, while large room for improvement exists in these challenging settings for continual learning.
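
For readers unfamiliar with elastic weight consolidation, its core is a quadratic penalty that anchors each parameter to its value after the first task, weighted by a diagonal Fisher information estimate. The sketch below shows only that penalty on plain Python lists; the function names, the flat parameter representation, and the example numbers are assumptions, and nothing here reflects the segmentation networks used in the paper.

```python
def ewc_penalty(params, old_params, fisher, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.
    `fisher` holds the diagonal Fisher information estimated on the first task."""
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher)
    )


def ewc_loss(new_task_loss, params, old_params, fisher, lam):
    """Total objective when training on the second task."""
    return new_task_loss + ewc_penalty(params, old_params, fisher, lam)


# Hypothetical values: the first parameter matters a lot for task one
# (large Fisher value), so moving it is penalized more heavily.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 1.0], lam=0.5)
```

A larger `lam` protects the first task more strongly at the cost of flexibility on the second, which is the trade-off behind the residual forgetting the abstract reports.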


          Variational Bayes Inference in Digital Receivers. (arXiv:1811.02506v1 [cs.IT])

Authors: Viet Hung Tran

The digital telecommunications receiver is an important context for inference methodology, the key objective being to minimize the expected loss function in recovering the transmitted information. For that criterion, the optimal decision is the Bayesian minimum-risk estimator. However, the computational load of the Bayesian estimator is often prohibitive and, hence, efficient computational schemes are required. The design of novel schemes, striking new balances between accuracy and computational load, is the primary concern of this thesis. Two popular techniques, one exact and one approximate, will be studied.

The exact scheme is a recursive one, namely the generalized distributive law (GDL), whose purpose is to distribute all operators across the conditionally independent (CI) factors of the joint model, so as to reduce the total number of operators required. In a novel theorem derived in this thesis, GDL, if applicable, will be shown to guarantee such a reduction in all cases. An associated lemma also quantifies this reduction. For practical use, two novel algorithms, namely the no-longer-needed (NLN) algorithm and the generalized form of the Markovian Forward-Backward (FB) algorithm, recursively factorize and compute the CI factors of an arbitrary model, respectively.

The approximate scheme is an iterative one, namely the Variational Bayes (VB) approximation, whose purpose is to find the independent (i.e. zero-order Markov) model closest to the true joint model in the minimum Kullback-Leibler divergence (KLD) sense. Despite being computationally efficient, this naive mean field approximation confers only modest performance for highly correlated models. A novel approximation, namely Transformed Variational Bayes (TVB), will be designed in the thesis in order to relax the zero-order constraint in the VB approximation, further reducing the KLD of the optimal approximation.


          Superregular grammars do not provide additional explanatory power but allow for a compact analysis of animal song. (arXiv:1811.02507v1 [q-bio.NC])

Authors: Takashi Morita, Hiroki Koda

A pervasive belief with regard to the differences between human language and animal vocal sequences (song) is that they belong to different classes of computational complexity, with animal song belonging to regular languages, whereas human language is superregular. This argument, however, lacks empirical evidence since superregular analyses of animal song are understudied. The goal of this paper is to perform a superregular analysis of animal song, using data from gibbons as a case study, and demonstrate that a superregular analysis can be effectively used with non-human data. A key finding is that a superregular analysis does not increase explanatory power but rather provides for compact analysis. For instance, fewer grammatical rules are necessary once superregularity is allowed. This pattern is analogous to a previous computational analysis of human language, and accordingly, the null hypothesis, that human language and animal song are governed by the same type of grammatical systems, cannot be rejected.


          SDR - half-baked or well done?. (arXiv:1811.02508v1 [cs.SD])

Authors: Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey

In speech enhancement and source separation, signal-to-noise ratio is a ubiquitous objective measure of denoising/separation quality. A decade ago, the BSS_eval toolkit was developed to give researchers worldwide a way to evaluate the quality of their algorithms in a simple, fair, and hopefully insightful way: it attempted to account for channel variations, and to not only evaluate the total distortion in the estimated signal but also split it in terms of various factors such as remaining interference, newly added artifacts, and channel errors. In recent years, hundreds of papers have been relying on this toolkit to evaluate their proposed methods and compare them to previous works, often arguing that differences on the order of 0.1 dB proved the effectiveness of a method over others. We argue here that the signal-to-distortion ratio (SDR) implemented in the BSS_eval toolkit has generally been improperly used and abused, especially in the case of single-channel separation, resulting in misleading results. We propose to use a slightly modified definition, resulting in a simpler, more robust measure, called scale-invariant SDR (SI-SDR). We present various examples of critical failure of the original SDR that SI-SDR overcomes.
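
A minimal sketch of the scale-invariant measure follows, assuming the usual formulation in which the target is rescaled by its optimal projection coefficient before the ratio is taken: SI-SDR = 10 log10(||a s||^2 / ||a s - e||^2), with a = <e, s> / ||s||^2 for target s and estimate e. The function name and the pure-Python list representation are illustrative choices, and the abstract itself does not spell out the formula.

```python
import math


def si_sdr(estimate, target):
    """Scale-invariant signal-to-distortion ratio in dB for two
    equally long sequences of samples."""
    target_energy = sum(t * t for t in target)
    if target_energy == 0.0:
        raise ValueError("target signal has zero energy")
    # Optimal scaling of the target towards the estimate.
    alpha = sum(e * t for e, t in zip(estimate, target)) / target_energy
    scaled_target = [alpha * t for t in target]
    signal = sum(v * v for v in scaled_target)
    distortion = sum((v - e) ** 2 for v, e in zip(scaled_target, estimate))
    if distortion == 0.0:
        return math.inf
    if signal == 0.0:
        # The estimate is orthogonal to the target.
        return -math.inf
    return 10.0 * math.log10(signal / distortion)


target = [1.0, 0.0, -1.0, 0.5]
estimate = [0.9, 0.1, -1.2, 0.4]
score = si_sdr(estimate, target)
rescaled_score = si_sdr([3.0 * e for e in estimate], target)
```

Rescaling the estimate leaves the score unchanged up to floating-point rounding, which is the property that distinguishes this measure from a plain signal-to-noise ratio.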


          UAlacant machine translation quality estimation at WMT 2018: a simple approach using phrase tables and feed-forward neural networks. (arXiv:1811.02510v1 [cs.CL])

Authors: Miquel Esplà-Gomis, Felipe Sánchez-Martínez, Mikel L. Forcada

We describe the Universitat d'Alacant submissions to the word- and sentence-level machine translation (MT) quality estimation (QE) shared task at WMT 2018. Our approach to word-level MT QE builds on previous work to mark the words in the machine-translated sentence as OK or BAD, and is extended to determine if a word or sequence of words needs to be inserted in the gap after each word. Our sentence-level submission simply uses the edit operations predicted by the word-level approach to approximate TER. The method presented ranked first in the sub-task of identifying insertions in gaps for three out of the six datasets, and second in the rest of them.


          Graph Based Power Flow Calculation for Energy Management System. (arXiv:1811.02512v1 [eess.SP])

Authors: Junjie Shi, Guangyi Liu, Renchang Dai, Jingjin Wu, Chen Yuan, Zhiwei Wang

Power flow calculation in EMS is required to accommodate a large and complex power system. To achieve a faster than real-time calculation, a graph based power flow calculation is proposed in this paper. Graph database and graph computing advantages in power system calculations are presented. A linear solver for power flow application is formulated and decomposed in nodal parallelism and hierarchical parallelism to fully utilize graph parallel computing capability. Comparison of the algorithm with traditional sequential programs shows significant benefits on computation efficiency. Case studies on practical large-scale systems provide supporting evidence that the new algorithm is promising for online computing for EMS.


          Computing Entity Semantic Similarity by Features Ranking. (arXiv:1811.02516v1 [cs.IR])

Authors: Livia Ruback, Claudio Lucchese, Alexander Arturo Mera Caraballo, Grettel Monteagudo García, Marco Antonio Casanova, Chiara Renso

This article presents a novel approach to estimate semantic entity similarity using entity features available as Linked Data. The key idea is to exploit ranked lists of features, extracted from Linked Data sources, as a representation of the entities to be compared. The similarity between two entities is then estimated by comparing their ranked lists of features. The article describes experiments with museum data from DBpedia, with datasets from a LOD catalog, and with computer science conferences from the DBLP repository. The experiments demonstrate that entity similarity, computed using ranked lists of features, achieves better accuracy than state-of-the-art measures.


          NeuralDrop: DNN-based Simulation of Small-Scale Liquid Flows on Solids. (arXiv:1811.02517v1 [cs.GR])

Authors: Rajaditya Mukherjee, Qingyang Li, Zhili Chen, Shicheng Chu, Huamin Wang

Small-scale liquid flows on solid surfaces provide convincing details in liquid animation, but they are difficult to simulate with efficiency and fidelity, mostly due to the complex nature of the surface tension at the contact front where liquid, air, and solid meet. In this paper, we propose to simulate the dynamics of new liquid drops from captured real-world liquid flow data, using deep neural networks. To achieve this goal, we develop a data capture system that acquires liquid flow patterns from hundreds of real-world water drops. We then convert raw data into compact data for training neural networks, in which liquid drops are represented by their contact fronts in a Lagrangian form. Using LSTM units based on recurrent neural networks, our neural networks serve three purposes in our simulator: predicting the contour of a contact front, predicting the color field gradient of a contact front, and finally predicting whether a contact front is going to break or not. Using these predictions, our simulator recovers the overall shape of a liquid drop at every time step, and handles merging and splitting events by simple operations. The experiment shows that our trained neural networks are able to perform predictions well. The whole simulator is robust, convenient to use, and capable of generating realistic small-scale liquid effects in animation.


          Solving SAT and MaxSAT with a Quantum Annealer: Foundations, Encodings, and Preliminary Results. (arXiv:1811.02524v1 [cs.ET])

Authors: Zhengbing Bian, Fabian Chudak, William Macready, Aidan Roy, Roberto Sebastiani, Stefano Varotti

Quantum annealers (QAs) are specialized quantum computers that minimize objective functions over discrete variables by physically exploiting quantum effects. Current QA platforms allow for the optimization of quadratic objectives defined over binary variables (qubits), also known as Ising problems. In the last decade, QA systems as implemented by D-Wave have scaled with Moore-like growth. Current architectures provide 2048 sparsely-connected qubits, and continued exponential growth is anticipated, together with increased connectivity. We explore the feasibility of such architectures for solving SAT and MaxSAT problems as QA systems scale. We develop techniques for effectively encoding SAT (and, with some limitations, MaxSAT) into Ising problems compatible with sparse QA architectures. We provide the theoretical foundations for this mapping, and present encoding techniques that combine offline Satisfiability and Optimization Modulo Theories with on-the-fly placement and routing. Preliminary empirical tests on a current generation 2048-qubit D-Wave system support the feasibility of the approach for certain SAT and MaxSAT problems.


          Double Adaptive Stochastic Gradient Optimization. (arXiv:1811.02525v1 [stat.ML])

Authors: Kin Gutierrez, Jin Li, Cristian Challu, Artur Dubrawski

Adaptive moment methods have been remarkably successful in deep learning optimization, particularly in the presence of noisy and/or sparse gradients. We further the advantages of adaptive moment techniques by proposing a family of double adaptive stochastic gradient methods, DASGrad. They leverage the complementary ideas of the adaptive moment algorithms widely used by the deep learning community, and recent advances in adaptive probabilistic algorithms. We analyze the theoretical convergence improvements of our approach in a stochastic convex optimization setting, and provide empirical validation of our findings with convex and non-convex objectives. We observe that the benefits of DASGrad increase with the model complexity and variability of the gradients, and we explore the resulting utility in extensions of distribution-matching multitask learning.


          Interactive coding resilient to an unknown number of erasures. (arXiv:1811.02527v1 [cs.DS])

Authors: Ran Gelles, Siddharth Iyer

We consider distributed computations between two parties carried out over a noisy channel that may erase messages. Following a noise model proposed by Dani et al. (2018), the noise level observed by the parties during the computation in our setting is arbitrary and a priori unknown to the parties.

We develop interactive coding schemes that adapt to the actual level of noise and correctly execute any two-party computation. Namely, in case the channel erases $T$ transmissions, the coding scheme will take $N+2T$ transmissions (using an alphabet of size $4$) to correctly simulate any binary protocol that takes $N$ transmissions assuming a noiseless channel. We can further reduce the communication to $N+T$ if we relax the communication model in a similar way to the adaptive setting of Agrawal et al. (2016), and allow the parties to remain silent rather than transmitting a message in each and every round of the coding scheme.

Our coding schemes are efficient, deterministic, have linear overhead both in their communication and round complexity, and succeed (with probability 1) regardless of the number of erasures $T$.


          Discriminative training of RNNLMs with the average word error criterion. (arXiv:1811.02528v1 [cs.CL])

Authors: Rémi Francis, Tom Ash, Will Williams

In automatic speech recognition (ASR), recurrent neural language models (RNNLM) are typically used to refine hypotheses in the form of lattices or n-best lists, which are generated by a beam search decoder with a weaker language model. The RNNLMs are usually trained generatively using the perplexity (PPL) criterion on large corpora of grammatically correct text. However, the hypotheses are noisy, and the RNNLM doesn't always make the choices that minimise the metric we optimise for, the word error rate (WER). To address this mismatch we propose to use a task-specific loss to train an RNNLM to discriminate between multiple hypotheses within a lattice rescoring scenario. By fine-tuning the RNNLM on lattices with the average edit distance loss, we show that we obtain a 1.9% relative improvement in word error rate over a purely generatively trained model.
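To make the idea of an average edit distance criterion concrete, here is a minimal sketch over an n-best list rather than a lattice. It assumes the loss is the edit distance to the reference, averaged under a softmax of the hypothesis scores; the abstract does not give the exact formulation, and the function names `edit_distance` and `expected_edit_distance` are hypothetical.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop reference word r
                            curr[j - 1] + 1,      # extra hypothesis word h
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def expected_edit_distance(scores, hypotheses, reference):
    """Edit distance averaged under the softmax of the hypothesis scores.

    Assumption: `scores` are log-domain model scores, one per hypothesis.
    """
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(weights)
    return sum(w / total * edit_distance(reference, h)
               for w, h in zip(weights, hypotheses))


reference = "the cat sat".split()
hypotheses = ["the cat sat".split(), "the cat sad".split(), "a cat".split()]
uniform_loss = expected_edit_distance([0.0, 0.0, 0.0], hypotheses, reference)
peaked_loss = expected_edit_distance([5.0, 0.0, 0.0], hypotheses, reference)
```

Raising the score of the error-free hypothesis lowers the loss, which is the direction a discriminatively trained rescoring model would be pushed in under this assumption.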


          A Bisimilarity Congruence for the Applied pi-Calculus Sufficiently Coarse to Verify Privacy Properties. (arXiv:1811.02536v1 [cs.CR])

Authors: Ross Horne

This paper is the first thorough investigation into the coarsest notion of bisimilarity for the applied pi-calculus that is a congruence relation: open barbed bisimilarity. An open variant of labelled bisimilarity (quasi-open bisimilarity), better suited to constructing bisimulations, is proven to coincide with open barbed bisimilarity. These bisimilarity congruences are shown to be characterised by an intuitionistic modal logic that can be used, for example, to describe an attack on privacy whenever a privacy property is violated. Open barbed bisimilarity provides a compositional approach to verifying cryptographic protocols, since properties proven can be reused in any context, including under input prefix. Furthermore, open barbed bisimilarity is sufficiently coarse for reasoning about security and privacy properties of cryptographic protocols, in contrast to the finer bisimilarity congruence, open bisimilarity, which cannot verify certain privacy properties.


          Deep feature transfer between localization and segmentation tasks. (arXiv:1811.02539v1 [cs.CV])

Authors: Szu-Yeu Hu, Andrew Beers, Ken Chang, Kathi Höbel, J. Peter Campbell, Deniz Erdogumus, Stratis Ioannidis, Jennifer Dy, Michael F. Chiang, Jayashree Kalpathy-Cramer, James M. Brown

In this paper, we propose a new pre-training scheme for U-net based image segmentation. We first train the encoding arm as a localization network to predict the center of the target, before extending it into a U-net architecture for segmentation. We apply our proposed method to the problem of segmenting the optic disc from fundus photographs. Our work shows that the features learned by the encoding arm can be transferred to the segmentation network to reduce the annotation burden. We propose that such an approach could have broad utility for medical image segmentation, and alleviate the burden of delineating complex structures by pre-training on annotations that are much easier to acquire.


          Composability of Regret Minimizers. (arXiv:1811.02540v1 [cs.LG])

Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm

Regret minimization is a powerful tool for solving large-scale problems; it was recently used in breakthrough results for large-scale extensive-form-game solving. This was achieved by composing simplex regret minimizers into an overall regret-minimization framework for extensive-form-game strategy spaces. In this paper we study the general composability of regret minimizers. We derive a calculus for constructing regret minimizers for complex convex sets that are constructed from convexity-preserving operations on simpler convex sets. In particular, we show that local regret minimizers for the simpler sets can be composed with additional regret minimizers into an aggregate regret minimizer for the complex set. As an application of our framework we show that the CFR framework can be constructed easily from our framework. We also show how to construct a CFR variant for extensive-form games with strategy constraints. Unlike a recently proposed variant of CFR for strategy constraints by Davis, Waugh, and Bowling (2018), the algorithm resulting from our calculus does not depend on any unknown constants and thus avoids binary search.
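As a rough illustration of the simplex regret minimizers that serve as building blocks, the sketch below implements regret matching, one standard regret minimizer over a probability simplex. The abstract does not say which simplex minimizer is used, so this choice, the class name `RegretMatcher`, and its method names are assumptions for illustration only.

```python
class RegretMatcher:
    """Regret matching over a probability simplex with n actions (a sketch)."""

    def __init__(self, n):
        self.cumulative_regret = [0.0] * n
        self.last_strategy = [1.0 / n] * n

    def next_strategy(self):
        """Play proportionally to positive cumulative regret, else uniformly."""
        positive = [max(r, 0.0) for r in self.cumulative_regret]
        total = sum(positive)
        n = len(positive)
        if total > 0:
            self.last_strategy = [p / total for p in positive]
        else:
            self.last_strategy = [1.0 / n] * n
        return self.last_strategy

    def observe_utility(self, utility):
        """Accumulate each action's regret against the last strategy played."""
        expected = sum(u * x for u, x in zip(utility, self.last_strategy))
        for i, u in enumerate(utility):
            self.cumulative_regret[i] += u - expected


minimizer = RegretMatcher(3)
first = minimizer.next_strategy()            # uniform: no regret observed yet
minimizer.observe_utility([1.0, 0.0, 0.0])   # action 0 was the only rewarding one
second = minimizer.next_strategy()           # all mass moves to action 0
```

A composition in the sense of the abstract would wire several such local minimizers together; this sketch covers only a single local one.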


          Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond. (arXiv:1811.02545v1 [cs.CV])

Authors: Krishna Kumar Singh, Hao Yu, Aron Sarmasi, Gautam Pradeep, Yong Jae Lee

We propose 'Hide-and-Seek', a general-purpose data augmentation technique, which is complementary to existing data augmentation techniques and is beneficial for various visual recognition tasks. The key idea is to hide patches in a training image randomly, in order to force the network to seek other relevant content when the most discriminative content is hidden. Our approach only needs to modify the input image and can work with any network to improve its performance. During testing, it does not need to hide any patches. The main advantage of Hide-and-Seek over existing data augmentation techniques is its ability to improve object localization accuracy in the weakly-supervised setting, and we therefore use this task to motivate the approach. However, Hide-and-Seek is not tied only to the image localization task, and can generalize to other forms of visual input like videos, as well as other recognition tasks like image classification, temporal action localization, semantic segmentation, emotion recognition, age/gender estimation, and person re-identification. We perform extensive experiments to showcase the advantage of Hide-and-Seek on these various visual recognition problems.
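The key idea of hiding random patches can be sketched in a few lines. The version below works on a plain 2D list of pixel values; the regular grid layout, the `patch_size` and `hide_prob` parameters, and the constant `fill_value` are assumptions made for illustration, since the abstract only says that patches are hidden randomly during training.

```python
import random


def hide_patches(image, patch_size, hide_prob, fill_value=0.0, rng=None):
    """Return a copy of a 2D image (list of rows) with random grid patches hidden.

    Each patch_size x patch_size cell of a regular grid is independently
    overwritten with fill_value with probability hide_prob.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the caller's image untouched
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            if rng.random() < hide_prob:
                for y in range(top, min(top + patch_size, height)):
                    for x in range(left, min(left + patch_size, width)):
                        out[y][x] = fill_value
    return out


image = [[1.0] * 8 for _ in range(8)]
augmented = hide_patches(image, patch_size=4, hide_prob=0.5, rng=random.Random(0))
```

At test time the function would simply not be called, matching the statement that no patches need to be hidden during testing.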


          A Model for General Intelligence. (arXiv:1811.02546v1 [cs.AI])

Authors: Paul Yaworsky

The overarching problem in artificial intelligence (AI) is that we do not understand the intelligence process well enough to enable the development of adequate computational models. Much work has been done in AI over the years at lower levels, but a big part of what has been missing involves the high level, abstract, general nature of intelligence. We address this gap by developing a model for general intelligence. To accomplish this, we focus on three basic aspects of intelligence. First, we must realize the general order and nature of intelligence at a high level. Second, we must come to know what these realizations mean with respect to the overall intelligence process. Third, we must describe these realizations as clearly as possible. We propose a hierarchical model to help capture and exploit the order within intelligence. The underlying order involves patterns of signals that become organized, stored and activated in space and time. These patterns can be described using a simple, general hierarchy, with physical signals at the lowest level, information in the middle, and abstract signal representations at the top. This high level perspective provides a big picture that literally helps us see the intelligence process, thereby enabling fundamental realizations, a better understanding and clear descriptions of the intelligence process. The resulting model can be used to support all kinds of information processing across multiple levels of abstraction. As computer technology improves, and as cooperation increases between humans and computers, people will become more efficient and more productive in performing their information processing tasks.


          Language GANs Falling Short. (arXiv:1811.02549v1 [cs.CL])

Authors: Massimo Caccia, Lucas Caccia, William Fedus, Hugo Larochelle, Joelle Pineau, Laurent Charlin

Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks. Maximum-Likelihood (MLE) models trained with teacher forcing have consistently been reported as weak baselines, where poor performance is attributed to exposure bias; at inference time, the model is fed its own prediction instead of a ground-truth token, which can lead to accumulating errors and poor samples. This line of reasoning has led to an outbreak of adversarial-based approaches for NLG, on the grounds that GANs do not suffer from exposure bias. In this work, we make several surprising observations which contradict common beliefs. We first revisit the canonical evaluation framework for NLG, and point out fundamental flaws with quality-only evaluation: we show that one can outperform such metrics using a simple, well-known temperature parameter to artificially reduce the entropy of the model's conditional distributions. Second, we leverage the control over the quality/diversity tradeoff given by this parameter to evaluate models over the whole quality-diversity spectrum, and find MLE models consistently outperform the proposed GAN variants, over the whole quality-diversity space. Our results have several implications: 1) the impact of exposure bias on sample quality is less severe than previously thought, and 2) temperature tuning provides a better quality/diversity tradeoff than adversarial training, while being easier to train, easier to cross-validate, and less computationally expensive.
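The effect of the temperature parameter on a conditional distribution can be shown with a small sketch. It assumes the usual formulation in which logits are divided by the temperature before the softmax; the logit values and function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature; temperature < 1 sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


logits = [2.0, 1.0, 0.1]  # hypothetical next-token logits
entropy_sharp = entropy(softmax_with_temperature(logits, 0.5))
entropy_plain = entropy(softmax_with_temperature(logits, 1.0))
entropy_flat = entropy(softmax_with_temperature(logits, 2.0))
```

Lower temperatures concentrate probability on the most likely tokens, trading diversity for quality; sweeping the temperature traces out the quality-diversity spectrum the abstract refers to.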


          Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms? (arXiv:1811.02553v1 [cs.LG])

Authors: Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement. We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict. Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology.


          Quantizers with Parameterized Distortion Measures. (arXiv:1811.02554v1 [cs.IT])

Authors: Jun Guo, Philipp Walk, Hamid Jafarkhani

In many quantization problems, the distortion function is given by the Euclidean metric to measure the distance of a source sample to any given reproduction point of the quantizer. In this work we consider distortion functions that are additively and multiplicatively weighted for each reproduction point, resulting in a heterogeneous quantization problem, as used for example in deployment problems of sensor networks. Whereas the average distortion in such problems is normally minimized for given weights (parameters), we optimize the quantization problem over all weights, i.e., we tune or control the distortion functions in our favor.

For a uniform source distribution in one dimension, we derive the unique minimizer, given as the uniform scalar quantizer with an optimal common weight. By numerical simulations, we demonstrate that this result extends to two dimensions, where asymptotically the parameter-optimized quantizer is the hexagonal lattice with common weights. As an application, we determine the optimal deployment of unmanned aerial vehicles (UAVs) to provide wireless communication to ground terminals under a minimal communication power cost. Here, the optimal weights relate to the optimal flight heights of the UAVs.


          Optimal Number of Choices in Rating Contexts. (arXiv:1605.06588v8 [cs.AI] UPDATED)

Authors: Sam Ganzfried, Farzana Yusuf

In many settings people must give numerical scores to entities from a small discrete set. For instance, rating physical attractiveness from 1 to 5 on dating sites, or papers from 1 to 10 for conference reviewing. We study the problem of understanding when using a different number of options is optimal. We consider the case when scores are uniform random and Gaussian. We study computationally when using 2, 3, 4, 5, and 10 options out of a total of 100 is optimal in these models (though our theoretical analysis is for a more general setting with $k$ choices from $n$ total options as well as a continuous underlying space). One may expect that using more options would always improve performance in this model, but we show that this is not necessarily the case, and that using fewer choices, even just two, can surprisingly be optimal in certain situations. While in theory for this setting it would be optimal to use all 100 options, in practice this is prohibitive, and it is preferable to utilize a smaller number of options due to humans' limited computational resources. Our results could have many potential applications, as settings requiring entities to be ranked by humans are ubiquitous. There could also be applications to other fields such as signal or image processing where input values from a large set must be mapped to output values in a smaller set.


          From Perception to Decision: A Data-driven Approach to End-to-end Motion Planning for Autonomous Ground Robots. (arXiv:1609.07910v3 [cs.RO] UPDATED)

Authors: Mark Pfeiffer, Michael Schaeuble, Juan Nieto, Roland Siegwart, Cesar Cadena

Learning from demonstration for motion planning is an ongoing research topic. In this paper we present a model that is able to learn the complex mapping from raw 2D-laser range findings and a target position to the required steering commands for the robot. To the best of our knowledge, this work presents the first approach that learns a target-oriented end-to-end navigation model for a robotic platform. The supervised model training is based on expert demonstrations generated in simulation with an existing motion planner. We demonstrate that the learned navigation model is directly transferable to previously unseen virtual and, more interestingly, real-world environments. It can safely navigate the robot through obstacle-cluttered environments to reach the provided targets. We present an extensive qualitative and quantitative evaluation of the neural network-based motion planner, and compare it to a grid-based global approach, both in simulation and in real-world experiments.


          A Continuous Model of Cortical Connectivity. (arXiv:1610.03809v2 [q-bio.NC] UPDATED)

Authors: Daniel Moyer, Boris A. Gutman, Joshua Faskowitz, Neda Jahanshad, Paul M. Thompson

We present a continuous model for structural brain connectivity based on the Poisson point process. The model treats each streamline curve in a tractography as an observed event in connectome space, here a product space of cortical white matter boundaries. We approximate the model parameter via kernel density estimation. To deal with the heavy computational burden, we develop a fast parameter estimation method by pre-computing associated Legendre products of the data, leveraging properties of the spherical heat kernel. We show how our approach can be used to assess the quality of cortical parcellations with respect to connectivity. We further present empirical results that suggest the discrete connectomes derived from our model have substantially higher test-retest reliability compared to standard methods.


          Uniform continuity bounds for information characteristics of quantum channels depending on input dimension and on input energy. (arXiv:1610.08870v5 [quant-ph] UPDATED)

Authors: M.E. Shirokov

We obtain continuity bounds for basic information characteristics of quantum channels depending on their input dimension (if it is finite) and on the input energy bound (if the input dimension is infinite). We pay special attention to the case of a multi-mode quantum oscillator as an input system.

First we prove continuity bounds for the output conditional mutual information for a single channel and for $n$ copies of a channel.

Then we obtain estimates for variation of the output Holevo quantity with respect to simultaneous variations of a channel and of an input ensemble.

As a result, tight and close-to-tight continuity bounds for basic capacities of quantum channels depending on the input dimension are obtained. They complement the Leung-Smith continuity bounds depending on the output dimension.

Finally, we obtain tight and close-to-tight continuity bounds for basic capacities of infinite-dimensional energy-constrained channels with respect to the energy-constrained Bures distance generating the strong convergence of quantum channels.


          Frames and numerical approximation. (arXiv:1612.04464v4 [math.NA] UPDATED)

Authors: Ben Adcock, Daan Huybrechs

Functions of one or more variables are usually approximated with a basis: a complete, linearly-independent system of functions that spans a suitable function space. The topic of this paper is the numerical approximation of functions using the more general notion of frames: that is, complete systems that are generally redundant but provide infinite representations with bounded coefficients. While frames are well-known in image and signal processing, coding theory and other areas of applied mathematics, their use in numerical analysis is far less widespread. Yet, as we show via a series of examples, frames are more flexible than bases, and can be constructed easily in a range of problems where finding orthonormal bases with desirable properties (rapid convergence, high resolution power, etc.) is difficult or impossible.

A key concern when using frames is that computing a best approximation requires solving an ill-conditioned linear system. Nonetheless, we construct a frame approximation via regularization with bounded condition number (with respect to perturbations in the data), and which approximates any function up to an error of order $\sqrt{\epsilon}$, or even of order $\epsilon$ with suitable modifications. Here $\epsilon$ is a threshold value that can be chosen by the user. Crucially, the rate of decay of the error down to this level is determined by the existence of approximate representations of $f$ in the frame possessing small-norm coefficients. We demonstrate the existence of such representations in all of our examples. Overall, our analysis suggests that frames are a natural generalization of bases in which to develop numerical approximation. In particular, even in the presence of severely ill-conditioned linear systems, the frame condition imposes sufficient mathematical structure to give rise to accurate, well-conditioned approximations.


          A $(2+\epsilon)$-Approximation for Maximum Weight Matching in the Semi-Streaming Model. (arXiv:1702.04536v2 [cs.DS] UPDATED)

Authors: Ami Paz, Gregory Schwartzman

We present a simple deterministic single-pass $(2+\epsilon)$-approximation algorithm for the maximum weight matching problem in the semi-streaming model. This improves upon the currently best known approximation ratio of $(4+\epsilon)$.

Our algorithm uses $O(n\log^2 n)$ bits of space for constant values of $\epsilon$. It relies on a variation of the local-ratio theorem, which may be of use for other algorithms in the semi-streaming model as well.
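The abstract does not spell out the algorithm, so the following is only a sketch of how a local-ratio style single-pass matching could look. The per-vertex potentials, the $(1+\epsilon)$ skip threshold, the stack, and the function name `local_ratio_matching` are all assumptions made for illustration, not details confirmed by the text above.

```python
def local_ratio_matching(edges, epsilon=0.1):
    """One pass over (u, v, weight) edges, then greedy unwinding of a stack.

    Assumption: weights are positive. Each vertex carries a potential equal to
    the residual weight already charged to it by earlier stacked edges.
    """
    potential = {}
    stack = []
    for u, v, weight in edges:
        covered = potential.get(u, 0.0) + potential.get(v, 0.0)
        if weight <= (1 + epsilon) * covered:
            continue  # edge is (approximately) paid for by its endpoints: skip it
        residual = weight - covered
        potential[u] = potential.get(u, 0.0) + residual
        potential[v] = potential.get(v, 0.0) + residual
        stack.append((u, v, weight))

    # Unwind in reverse arrival order, keeping edges whose endpoints are still free.
    matched_vertices = set()
    matching = []
    while stack:
        u, v, weight = stack.pop()
        if u not in matched_vertices and v not in matched_vertices:
            matching.append((u, v, weight))
            matched_vertices.update((u, v))
    return matching


stream = [("a", "b", 1.0), ("b", "c", 10.0), ("c", "d", 1.0)]
result = local_ratio_matching(stream)
```

On this small path the heavy middle edge is kept and the two light edges are discarded, one because it is skipped on arrival and one because its endpoint is already matched when the stack is unwound.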


          Distribution System Voltage Control under Uncertainties using Tractable Chance Constraints. (arXiv:1704.08999v4 [math.OC] UPDATED)

Authors: Pan Li, Baihong Jin, Dai Wang, Baosen Zhang

Voltage control plays an important role in the operation of electricity distribution networks, especially with high penetration of distributed energy resources. These resources introduce significant and fast-varying uncertainties. In this paper, we focus on reactive power compensation to control voltage in the presence of uncertainties. We adopt a chance constraint approach that accounts for arbitrary correlations between renewable resources at each of the buses. We show how the problem can be solved efficiently using historical samples via a stochastic quasi-gradient method. We also show that this optimization problem is convex for a wide variety of probabilistic distributions. Compared to conventional per-bus chance constraints, our formulation is more robust to uncertainty and more computationally tractable. We illustrate the results using standard IEEE distribution test feeders.


          Distributed Simultaneous Action and Target Assignment for Multi-Robot Multi-Target Tracking. (arXiv:1706.02245v2 [cs.RO] UPDATED)

Authors: Yoonchang Sung, Ashish Kumar Budhiraja, Ryan K. Williams, Pratap Tokekar

We study a multi-robot assignment problem for multi-target tracking. The proposed problem can be viewed as a mixed packing and covering problem. To deal with limits on both sensing and communication ranges, we take a distributed approach. A local algorithm gives theoretical bounds on both the running time and the approximation ratio to an optimal solution. We employ a local algorithm for max-min linear programs to solve the proposed task. Simulation results show that a local algorithm is an effective solution to multi-robot task allocation.


          Interpretation of Neural Networks is Fragile. (arXiv:1710.10547v2 [stat.ML] UPDATED)

Authors: Amirata Ghorbani, Abubakar Abid, James Zou

In order for machine learning to be deployed and trusted in many applications, it is crucial to be able to reliably explain why the machine learning algorithm makes certain predictions. For example, if an algorithm classifies a given pathology image to be a malignant tumor, then the doctor may need to know which parts of the image led the algorithm to this classification. How to interpret black-box predictors is thus an important and active area of research. A fundamental question is: how much can we trust the interpretation itself? In this paper, we show that interpretation of deep learning predictions is extremely fragile in the following sense: two perceptually indistinguishable inputs with the same predicted label can be assigned very different interpretations. We systematically characterize the fragility of several widely-used feature-importance interpretation methods (saliency maps, relevance propagation, and DeepLIFT) on ImageNet and CIFAR-10. Our experiments show that even small random perturbations can change the feature importance, and new systematic perturbations can lead to dramatically different interpretations without changing the label. We extend these results to show that interpretations based on exemplars (e.g. influence functions) are similarly fragile. Our analysis of the geometry of the Hessian matrix gives insight into why fragility could be a fundamental challenge to the current interpretation approaches.


          Machine Learning for Set-Identified Linear Models. (arXiv:1712.10024v2 [stat.ML] UPDATED)

Authors: Vira Semenova

Set-identified models often restrict the number of covariates, leading to wide identified sets in practice. This paper provides estimation and inference methods for set-identified linear models with high-dimensional covariates where the model selection is based on modern machine learning tools. I characterize the boundary (i.e., support function) of the identified set using a semiparametric moment condition. Combining Neyman-orthogonality and sample splitting ideas, I construct a root-N consistent, uniformly asymptotically Gaussian estimator of the support function. I also prove the validity of the Bayesian bootstrap procedure to conduct inference about the identified set. I provide a general method to construct a Neyman-orthogonal moment condition for the support function. I apply this result to estimate sharp nonparametric bounds on the average treatment effect in Lee (2008)'s model of endogenous selection and substantially tighten the bounds on this parameter in Angrist et al. (2006)'s empirical setting. I also apply this result to estimate sharp identified sets for two other parameters: a new parameter, called a partially linear predictor, and the average partial derivative when the outcome variable is recorded in intervals.


          Asymptotically Optimal Scheduling for Compute-and-Forward. (arXiv:1801.03259v6 [cs.IT] UPDATED)

Authors: Ori Shmuel, Asaf Cohen, Omer Gurewitz

Consider a Compute and Forward (CF) relay network with $L$ users and a single relay. The relay tries to decode a linear function of the transmitted signals. For such a network, letting all $L$ users transmit simultaneously, especially when $L$ is large, causes a significant degradation in the rate at which the relay is able to decode. In fact, the rate goes to zero very fast with $L$. Therefore, in each transmission phase only a fixed number of users should transmit, i.e., users should be scheduled.

In this work, we examine the problem of scheduling for CF and lay the foundations for identifying the optimal schedule, which, to date, is not clearly understood. Specifically, we start with insights into why good scheduling opportunities can be found when the number of users is large. Then, we provide an asymptotically optimal, polynomial-time scheduling algorithm and analyze its performance. We conclude that scheduling under CF provides a gain in the system sum-rate, up to the optimal scaling law of $O(\log{\log{L}})$.


          New Perspectives on Zero-Knowledge Multi-Prover Interactive Proofs. (arXiv:1801.04598v3 [quant-ph] UPDATED)

Authors: Claude Crépeau, Nan Yang

In multi-prover interactive proofs (MIPs), the verifier can provide non-local resources for the provers intrinsically. In most cases, this is undesirable. Existing proofs of soundness do not account for the verifier's non-local potential. We show that this may be a problem for many MIPs. We provide a solution by constructing a generalization of the MIP model, of which standard MIPs are a special case. This new model accounts for both the prover and the verifier's non-local correlations. A new property of multi-prover zero-knowledge naturally emerges as a result.


          Belief Control Strategies for Interactions over Weakly-Connected Graphs. (arXiv:1801.05479v2 [cs.MA] UPDATED)

Authors: Hawraa Salami, Bicheng Ying, Ali H. Sayed

In diffusion social learning over weakly-connected graphs, it has been shown recently that influential agents shape the beliefs of non-influential agents. This paper analyzes this mechanism more closely and addresses two main questions. First, the article examines how much freedom influential agents have in controlling the beliefs of the receiving agents, namely, whether receiving agents can be driven to arbitrary beliefs and whether the network structure limits the scope of control by the influential agents. Second, even if there is a limit to what influential agents can accomplish, this article develops mechanisms by which they can lead receiving agents to adopt certain beliefs. These questions raise interesting possibilities about belief control over networked agents. Once addressed, one ends up with design procedures that allow influential agents to drive other agents to endorse particular beliefs regardless of their local observations or convictions. The theoretical findings are illustrated by means of examples.


          A Signature-based Algorithm for Computing Gröbner Bases over Principal Ideal Domains. (arXiv:1802.01388v2 [cs.SC] UPDATED)

Authors: Maria Francis, Thibaut Verron

Signature-based algorithms have become a standard approach for Gr\"obner basis computations for polynomial systems over fields, but how to extend these techniques to coefficients in general rings is not yet as well understood.

In this paper, we present a proof-of-concept signature-based algorithm for computing Gr\"obner bases over commutative integral domains. It is adapted from a general version of M\"oller's algorithm (1988) which considers reductions by multiple polynomials at each step. This algorithm performs reductions with non-decreasing signatures, and in particular, signature drops do not occur. When the coefficients are from a principal ideal domain (e.g. the ring of integers or the ring of univariate polynomials over a field), we prove correctness and termination of the algorithm, and we show how to use signature properties to implement classic signature-based criteria to eliminate some redundant reductions. In particular, if the input is a regular sequence, the algorithm operates without any reduction to 0.

We have written a toy implementation of the algorithm in Magma. Early experimental results suggest that the algorithm might even be correct and terminate in a more general setting, for polynomials over a unique factorization domain (e.g. the ring of multivariate polynomials over a field or a PID).


          Stronger generalization bounds for deep nets via a compression approach. (arXiv:1802.05296v3 [cs.LG] UPDATED)

Authors: Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang

Deep nets generalize well despite having more parameters than the number of training samples. Recent works try to give an explanation using PAC-Bayes and margin-based analyses, but do not as yet result in sample complexity bounds better than naive parameter counting. The current paper shows generalization bounds that are orders of magnitude better in practice. These rely upon new succinct reparametrizations of the trained net: a compression that is explicit and efficient. These yield generalization bounds via a simple compression-based framework introduced here. Our results also provide some theoretical justification for widespread empirical success in compressing deep nets. Analysis of correctness of our compression relies upon some newly identified "noise stability" properties of trained deep nets, which are also experimentally verified. The study of these properties and resulting generalization bounds is also extended to convolutional nets, which had eluded earlier attempts on proving generalization.


          A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks. (arXiv:1802.06153v2 [cs.LG] UPDATED)

Authors: Jeffrey Chan, Valerio Perrone, Jeffrey P. Spence, Paul A. Jenkins, Sara Mathieson, Yun S. Song

An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.


          Semi-Supervised Algorithms for Approximately Optimal and Accurate Clustering. (arXiv:1803.00926v2 [cs.DS] UPDATED)

Authors: Buddhima Gamlath, Sangxia Huang, Ola Svensson

We study $k$-means clustering in a semi-supervised setting. Given an oracle that returns whether two given points belong to the same cluster in a fixed optimal clustering, we investigate the following question: how many oracle queries are sufficient to efficiently recover a clustering that, with probability at least $(1 - \delta)$, simultaneously has a cost of at most $(1 + \epsilon)$ times the optimal cost and an accuracy of at least $(1 - \epsilon)$?

We show how to achieve such a clustering on $n$ points with $O{((k^2 \log n) \cdot m{(Q, \epsilon^4, \delta / (k\log n))})}$ oracle queries, when the $k$ clusters can be learned with an $\epsilon'$ error and a failure probability $\delta'$ using $m(Q, \epsilon',\delta')$ labeled samples in the supervised setting, where $Q$ is the set of candidate cluster centers. We show that $m(Q, \epsilon', \delta')$ is small both for $k$-means instances in Euclidean space and for those in finite metric spaces. We further show that, for the Euclidean $k$-means instances, we can avoid the dependency on $n$ in the query complexity at the expense of an increased dependency on $k$: specifically, we give a slightly more involved algorithm that uses $O(k^4/(\epsilon^2 \delta) + (k^{9}/\epsilon^4) \log(1/\delta) + k \cdot m(\mathbb{R}^r, \epsilon^4/k, \delta))$ oracle queries.

We also show that the number of queries needed for $(1 - \epsilon)$-accuracy in Euclidean $k$-means must linearly depend on the dimension of the underlying Euclidean space, and for finite metric space $k$-means, we show that it must at least be logarithmic in the number of candidate centers. This shows that our query complexities capture the right dependencies on the respective parameters.
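To make the same-cluster oracle concrete, here is a minimal sketch of a naive baseline, not the algorithm from the paper: each point is compared against one representative per cluster found so far, so exact recovery costs at most $k$ queries per point. The function name `cluster_with_oracle` and the hidden label table are hypothetical and serve only as illustration.

```python
def cluster_with_oracle(points, same_cluster):
    """Naive exact recovery: compare each point with one representative
    of every cluster discovered so far (at most k queries per point)."""
    representatives = []  # one known member per discovered cluster
    clusters = []         # clusters[i] holds the points grouped with representatives[i]
    queries = 0
    for p in points:
        for idx, rep in enumerate(representatives):
            queries += 1
            if same_cluster(p, rep):
                clusters[idx].append(p)
                break
        else:
            # No representative matched, so p starts a new cluster.
            representatives.append(p)
            clusters.append([p])
    return clusters, queries


# Hypothetical hidden optimal clustering, visible only through the oracle.
hidden = {0: "a", 1: "b", 2: "a", 3: "c", 4: "b"}
oracle = lambda p, q: hidden[p] == hidden[q]

clusters, queries = cluster_with_oracle(range(5), oracle)
```

This baseline uses up to $nk$ queries, which is the linear dependence on $n$ that the bounds above improve on.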


          Artificial Intelligence Enabled Software Defined Networking: A Comprehensive Overview. (arXiv:1803.06818v3 [cs.AI] UPDATED)

Authors: Majd Latah, Levent Toker

Software defined networking (SDN) represents a promising networking architecture that combines central management and network programmability. SDN separates the control plane from the data plane and moves the network management to a central point, called the controller, that can be programmed and used as the brain of the network. Recently, the research community has shown an increased tendency to benefit from the recent advancements in the artificial intelligence (AI) field to provide learning abilities and better decision making in SDN. In this study, we provide a detailed overview of the recent efforts to include AI in SDN. Our study showed that the research efforts focused on three main sub-fields of AI, namely machine learning, meta-heuristics and fuzzy inference systems. Accordingly, in this work we investigate their different application areas and potential use, as well as the improvements achieved by including AI-based techniques in the SDN paradigm.


          Reservoir computing approaches for representation and classification of multivariate time series. (arXiv:1803.07870v2 [cs.NE] UPDATED)

Authors: Filippo Maria Bianchi, Simone Scardapane, Sigurd Løkse, Robert Jenssen

Classification of multivariate time series (MTS) has been tackled with a large variety of methodologies and applied to a wide range of scenarios. Among the existing approaches, reservoir computing (RC) techniques, which implement a fixed and high-dimensional recurrent network to process sequential data, are computationally efficient tools to generate a vectorial, fixed-size representation of the MTS that can be further processed by standard classifiers. Despite their unrivaled training speed, MTS classifiers based on a standard RC architecture fail to achieve the same accuracy as other classifiers, such as those exploiting fully trainable recurrent networks. In this paper we introduce the reservoir model space, an RC approach to learn vectorial representations of MTS in an unsupervised fashion. Each MTS is encoded within the parameters of a linear model trained to predict a low-dimensional embedding of the reservoir dynamics. Our model space yields a powerful representation of the MTS and, thanks to an intermediate dimensionality reduction procedure, attains computational performance comparable to other RC methods. As a second contribution we propose a modular RC framework for MTS classification, with an associated open source Python library. By combining the different modules it is possible to seamlessly implement advanced RC architectures, including our proposed unsupervised representation, bidirectional reservoirs, and non-linear readouts, such as deep neural networks with both fixed and flexible activation functions. Results obtained on benchmark and real-world MTS datasets show that RC classifiers are dramatically faster and, when implemented using our proposed representation, also achieve superior classification accuracy.


          Verifier Non-Locality in Interactive Proofs. (arXiv:1804.02724v3 [quant-ph] UPDATED)

Authors: Claude Crépeau, Nan Yang

In multi-prover interactive proofs, the verifier interrogates the provers and attempts to steal their knowledge. Other than that, the verifier's role has not been studied. Augmentation of the provers with non-local resources results in classes of languages that may not be NEXP. We have discovered that the verifier plays a much more important role than previously thought. Simply put, the verifier has the capability of providing non-local resources for the provers intrinsically. Therefore, standard MIPs may already contain protocols equivalent to one in which the prover is augmented non-locally. Existing MIPs' proofs of soundness implicitly depend on the fact that the verifier is not a non-local resource provider. The verifier's non-locality is a new unused tool and liability for protocol design and analysis. Great care should have been taken when claiming that ZKMIP=MIP. We show specific issues with existing protocols and revisit the proof of this statement. For this purpose, we also define a new model of multi-prover interactive proofs which we call "correlational confinement form".


          Edge-sum distinguishing labeling. (arXiv:1804.05411v3 [math.CO] UPDATED)

Authors: Jan Bok, Nikola Jedličková

We study \emph{edge-sum distinguishing labeling}, a type of labeling recently introduced by Tuza in [Zs. Tuza, \textit{Electronic Notes in Discrete Mathematics} 60, (2017), 61-68] in the context of labeling games.

An \emph{ESD labeling} of an $n$-vertex graph $G$ is an injective mapping of integers $1$ to $l$ to its vertices such that for every edge, the sum of the integers on its endpoints is unique. If $l$ equals $n$, we speak about a \emph{canonical ESD labeling}.

We focus primarily on structural properties of this labeling and show for several classes of graphs whether they have a canonical ESD labeling. As an application we show some implications of these results for games based on ESD labeling. We also observe that ESD labeling is closely connected to the well-known notions of the \emph{Sidon sequence} and \emph{harmonious labeling}.
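The definition of an ESD labeling can be checked mechanically. The sketch below is a brute-force illustration and assumes nothing beyond the definition above; the function names are hypothetical and the exhaustive search is only practical for very small graphs.

```python
from itertools import permutations


def is_esd_labeling(edges, labels):
    """True if the labels are pairwise distinct and all edge sums are distinct."""
    values = list(labels.values())
    if len(set(values)) != len(values):
        return False  # the labeling is not injective
    sums = [labels[u] + labels[v] for u, v in edges]
    return len(set(sums)) == len(sums)


def has_canonical_esd_labeling(vertices, edges):
    """Brute force: try every assignment of the labels 1..n to the n vertices."""
    n = len(vertices)
    return any(
        is_esd_labeling(edges, dict(zip(vertices, perm)))
        for perm in permutations(range(1, n + 1))
    )


cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
complete4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]

c4_ok = has_canonical_esd_labeling(range(4), cycle4)
k4_ok = has_canonical_esd_labeling(range(4), complete4)
```

The 4-cycle admits a canonical ESD labeling (labels 1, 2, 4, 3 around the cycle give edge sums 3, 6, 7, 4), while the complete graph on four vertices does not, because the edges labeled {1, 4} and {2, 3} both sum to 5.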


          Graph Bayesian Optimization: Algorithms, Evaluations and Applications. (arXiv:1805.01157v4 [stat.ML] UPDATED)

Authors: Jiaxu Cui, Bo Yang

Network structure optimization is a fundamental task in complex network analysis. However, almost all the research on Bayesian optimization is aimed at optimizing the objective functions with vectorial inputs. In this work, we first present a flexible framework, denoted graph Bayesian optimization, to handle arbitrary graphs in the Bayesian optimization community. By combining the proposed framework with graph kernels, it can take full advantage of implicit graph structural features to supplement explicit features guessed according to the experience, such as tags of nodes and any attributes of graphs. The proposed framework can identify which features are more important during the optimization process. We apply the framework to solve four problems including two evaluations and two applications to demonstrate its efficacy and potential applications.


          Semi-Orthogonal Non-Negative Matrix Factorization. (arXiv:1805.02306v2 [stat.ME] UPDATED)

Authors: Jack Yutong Li, Ruoqing Zhu, Annie Qu, Han Ye, Zhankun Sun

Non-negative Matrix Factorization (NMF) is a popular clustering and dimension reduction method that decomposes a non-negative matrix into the product of two lower-dimensional matrices composed of basis vectors. In this paper, we propose semi-orthogonal NMF, a novel method that enforces one of the matrices to be orthogonal with mixed signs. Our method preserves strict orthogonality by implementing the Cayley transformation to force the solution path to be exactly on the Stiefel manifold, as opposed to the approximated orthogonality solutions in existing literature. We apply a line search update scheme along with an SVD-based initialization, which produces rapid convergence of the algorithm compared to other existing approaches. In addition, we present formulations of our method to incorporate both continuous and binary design matrices. Through various simulation studies, we show that our model has an advantage over other NMF variations regarding the accuracy of the factorization, rate of convergence, and the degree of orthogonality while being computationally competitive. We also apply our method to a text mining application on classifying triage notes and show the effectiveness of our model in increasing classification accuracy compared to the conventional bag-of-words model and other alternative matrix factorization approaches. The proposed methods, along with a variety of existing non-negative matrix factorization approaches, are implemented in the R package `MatrixFact', which is available on GitHub.


          geomstats: a Python Package for Riemannian Geometry in Machine Learning. (arXiv:1805.08308v2 [cs.LG] UPDATED)

Authors: Nina Miolane, Johan Mathe, Claire Donnat, Mikael Jorda, Xavier Pennec

We introduce geomstats, a Python package that performs computations on manifolds such as hyperspheres, hyperbolic spaces, spaces of symmetric positive definite matrices and Lie groups of transformations. We provide efficient and extensively unit-tested implementations of these manifolds, together with useful Riemannian metrics and associated Exponential and Logarithm maps. The corresponding geodesic distances provide a range of intuitive choices of Machine Learning loss functions. We also give the corresponding Riemannian gradients. The operations implemented in geomstats are available with different computing backends such as numpy, tensorflow and keras. We have enabled GPU implementation and integrated geomstats manifold computations into the keras deep learning framework. This paper also presents a review of manifolds in machine learning and an overview of the geomstats package with examples demonstrating its use for efficient and user-friendly Riemannian geometry.
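As an illustration of what Exponential and Logarithm maps and geodesic distances mean on one of the listed manifolds, the following sketch implements them for the unit hypersphere in plain Python. It deliberately does not use the geomstats API; the function names are hypothetical and chosen only for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def sphere_exp(x, v):
    """Exponential map at unit vector x applied to a tangent vector v (dot(x, v) == 0)."""
    t = norm(v)
    if t < 1e-12:
        return list(x)
    return [math.cos(t) * xi + math.sin(t) * vi / t for xi, vi in zip(x, v)]


def sphere_log(x, y):
    """Logarithm map at x of the unit vector y (not unique when y is antipodal to x)."""
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)
    w = [yi - c * xi for xi, yi in zip(x, y)]  # component of y orthogonal to x
    n = norm(w)
    if n < 1e-12:
        return [0.0] * len(x)
    return [theta * wi / n for wi in w]


def sphere_dist(x, y):
    """Geodesic distance: the angle between the two unit vectors."""
    return math.acos(max(-1.0, min(1.0, dot(x, y))))


north = [0.0, 0.0, 1.0]
east = [1.0, 0.0, 0.0]
tangent = sphere_log(north, east)
back = sphere_exp(north, tangent)
```

Here `sphere_exp(north, sphere_log(north, east))` recovers `east` up to rounding, and the norm of the tangent vector equals the geodesic distance, which is the quantity that can serve as a loss function.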


          Deep Reinforcement Learning of Marked Temporal Point Processes. (arXiv:1805.09360v2 [cs.LG] UPDATED)

Authors: Utkarsh Upadhyay, Abir De, Manuel Gomez-Rodriguez

In a wide variety of applications, humans interact with a complex environment by means of asynchronous stochastic discrete events in continuous time. Can we design online interventions that will help humans achieve certain goals in such an asynchronous setting? In this paper, we address the above problem from the perspective of deep reinforcement learning of marked temporal point processes, where both the actions taken by an agent and the feedback it receives from the environment are asynchronous stochastic discrete events characterized using marked temporal point processes. In doing so, we define the agent's policy using the intensity and mark distribution of the corresponding process and then derive a flexible policy gradient method, which embeds the agent's actions and the feedback it receives into real-valued vectors using deep recurrent neural networks. Our method does not make any assumptions on the functional form of the intensity and mark distribution of the feedback and it allows for arbitrarily complex reward functions. We apply our methodology to two different applications in personalized teaching and viral marketing and, using data gathered from Duolingo and Twitter, we show that it may be able to find interventions to help learners and marketers achieve their goals more effectively than alternatives.


          Towards Robust Evaluations of Continual Learning. (arXiv:1805.09733v2 [stat.ML] UPDATED)

Authors: Sebastian Farquhar, Yarin Gal

The experiments used in current continual learning research do not faithfully assess fundamental challenges of learning continually. We examine standard evaluations and show why these evaluations make some types of continual learning approaches look better than they are. In particular, current evaluations are biased towards continual learning approaches that treat previous models as a prior (e.g., EWC, VCL). We introduce desiderata for continual learning evaluations and explain why their absence creates misleading comparisons. Our analysis calls for a reprioritization of research effort by the community.


          TADAM: Task dependent adaptive metric for improved few-shot learning. (arXiv:1805.10123v3 [cs.LG] UPDATED)

Authors: Boris N. Oreshkin, Pau Rodriguez, Alexandre Lacoste

Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100.
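A rough sketch of what metric scaling can look like in a distance-based few-shot classifier: class probabilities are a softmax over negative distances to class prototypes, multiplied by a scale factor. The exact form used in the paper is not given in the abstract, so the multiplicative factor `alpha` and the function name below are assumptions made for illustration.

```python
import math


def class_probabilities(distances, alpha):
    """Softmax over negative distances, scaled by alpha (assumed form of metric scaling)."""
    logits = [-alpha * d for d in distances]
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Distances from one query embedding to three hypothetical class prototypes.
distances = [1.0, 2.0, 3.0]
soft = class_probabilities(distances, alpha=1.0)
sharp = class_probabilities(distances, alpha=10.0)
flat = class_probabilities(distances, alpha=0.0)
```

A larger `alpha` concentrates probability mass on the nearest prototype and `alpha = 0` yields a uniform distribution, so the scale changes how strongly each example contributes to the gradient of a cross-entropy loss.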


          Learning Restricted Boltzmann Machines via Influence Maximization. (arXiv:1805.10262v2 [cs.LG] UPDATED)

Authors: Guy Bresler, Frederic Koehler, Ankur Moitra, Elchanan Mossel

Graphical models are a rich language for describing high-dimensional distributions in terms of their dependence structure. While there are algorithms with provable guarantees for learning undirected graphical models in a variety of settings, there has been much less progress in the important scenario when there are latent variables. Here we study Restricted Boltzmann Machines (or RBMs), which are a popular model with wide-ranging applications in dimensionality reduction, collaborative filtering, topic modeling, feature extraction and deep learning.

The main message of our paper is a strong dichotomy in the feasibility of learning RBMs, depending on the nature of the interactions between variables: ferromagnetic models can be learned efficiently, while general models cannot. In particular, we give a simple greedy algorithm based on influence maximization to learn ferromagnetic RBMs with bounded degree. In fact, we learn a description of the distribution on the observed variables as a Markov Random Field. Our analysis is based on tools from mathematical physics that were developed to show the concavity of magnetization. Our algorithm extends straightforwardly to general ferromagnetic Ising models with latent variables.

Conversely, we show that even for a constant number of latent variables with constant degree, without ferromagneticity the problem is as hard as sparse parity with noise. This hardness result is based on a sharp and surprising characterization of the representational power of bounded degree RBMs: the distribution on their observed variables can simulate any bounded order MRF. This result is of independent interest since RBMs are the building blocks of deep belief networks.


          Generalizing to Unseen Domains via Adversarial Data Augmentation. (arXiv:1805.12018v2 [cs.CV] UPDATED)

Authors: Riccardo Volpi, Hongseok Namkoong, Ozan Sener, John Duchi, Vittorio Murino, Silvio Savarese

We are concerned with learning models that generalize well to different \emph{unseen} domains. We consider a worst-case formulation over data distributions that are near the source domain in the feature space. Only using training data from a single source distribution, we propose an iterative procedure that augments the dataset with examples from a fictitious target domain that is "hard" under the current model. We show that our iterative scheme is an adaptive data augmentation method where we append adversarial examples at each iteration. For softmax losses, we show that our method is a data-dependent regularization scheme that behaves differently from classical regularizers that regularize towards zero (e.g., ridge or lasso). On digit recognition and semantic segmentation tasks, our method learns models that improve performance across a range of a priori unknown target domains.


          A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation. (arXiv:1806.02450v2 [cs.LG] UPDATED)

Authors: Jalaj Bhandari, Daniel Russo, Raghav Singal

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement learning, its theoretical analysis has proved challenging and few guarantees on its statistical efficiency are available. In this work, we provide a simple and explicit finite time analysis of temporal difference learning with linear function approximation. Except for a few key insights, our analysis mirrors standard techniques for analyzing stochastic gradient descent algorithms, and therefore inherits the simplicity and elegance of that literature. Final sections of the paper show how all of our main results extend to the study of TD learning with eligibility traces, known as TD($\lambda$), and to Q-learning applied in high-dimensional optimal stopping problems.
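For readers unfamiliar with the algorithm being analyzed, here is a minimal sketch of TD(0) with linear function approximation, using a constant step size. The two-state chain, the one-hot features and the function name are hypothetical choices for illustration and are not taken from the paper.

```python
def td0_linear(transitions, features, gamma=0.9, step_size=0.1):
    """TD(0): the value estimate of a state is the dot product theta . phi(state)."""
    num_features = len(next(iter(features.values())))
    theta = [0.0] * num_features
    for state, reward, next_state in transitions:
        phi = features[state]
        phi_next = features[next_state]
        value = sum(w * x for w, x in zip(theta, phi))
        next_value = sum(w * x for w, x in zip(theta, phi_next))
        td_error = reward + gamma * next_value - value
        # Semi-gradient step: move theta along phi(state), scaled by the TD error.
        theta = [w + step_size * td_error * x for w, x in zip(theta, phi)]
    return theta


# Deterministic two-state chain: 0 -> 1 with reward 1, then 1 -> 0 with reward 0.
features = {0: [1.0, 0.0], 1: [0.0, 1.0]}
transitions = [(0, 1.0, 1), (1, 0.0, 0)] * 1000
theta = td0_linear(transitions, features, gamma=0.5, step_size=0.1)
```

With one-hot features the estimate is tabular, and for this chain the true values solve $V_0 = 1 + \gamma V_1$ and $V_1 = \gamma V_0$, giving $V_0 = 4/3$ and $V_1 = 2/3$ for $\gamma = 0.5$.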


          A Computational Approach to Organizational Structure. (arXiv:1806.05701v3 [cs.DS] UPDATED)

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Anson Kahng, Ariel D. Procaccia

An organizational structure defines how an organization arranges and manages its individuals. Because individuals in an organization take time to both perform tasks and communicate, organizational structure must take both computation and communication into account. Sociologists have studied organizational structure from an empirical perspective for decades, but their work has been limited by small sample sizes. By contrast, we study organizational structure from a computational and theoretical perspective.

Specifically, we introduce a model of organizations that involves both computation and communication, and captures the spirit of past sociology experiments. In our model, each node in a graph starts with a token, and at any time can either perform computation to merge two tokens in $t_c$ time, or perform communication by sending a token to a neighbor in $t_m$ time. We study how to schedule computations and communications so as to merge all tokens as quickly as possible.

As our first result, we give a polynomial-time algorithm that optimally solves this problem on a complete graph. This result characterizes the optimal graph structure---the edges used for communication in the optimal schedule---and therefore informs the optimal design of large-scale organizations. Moreover, since pre-existing organizations may want to optimize their workflow, we also study this problem on arbitrary graphs. We demonstrate that our problem on arbitrary graphs is not only NP-hard but also hard to approximate within a multiplicative $1.5$ factor. Finally, we give an $O(\log n \cdot \log \frac{\text{OPT}}{t_m})$-approximation algorithm for our problem on arbitrary graphs.


          BinGAN: Learning Compact Binary Descriptors with a Regularized GAN. (arXiv:1806.06778v4 [cs.CV] UPDATED)

Authors: Maciej Zieba, Piotr Semberecki, Tarek El-Gaaly, Tomasz Trzcinski

In this paper, we propose a novel regularization method for Generative Adversarial Networks, which allows the model to learn discriminative yet compact binary representations of image patches (image descriptors). We employ the dimensionality reduction that takes place in the intermediate layers of the discriminator network and train a binarized low-dimensional representation of the penultimate layer to mimic the distribution of the higher-dimensional preceding layers. To achieve this, we introduce two loss terms that aim at: (i) reducing the correlation between the dimensions of the binarized low-dimensional representation of the penultimate layer (i.e. maximizing joint entropy) and (ii) propagating the relations between the dimensions in the high-dimensional space to the low-dimensional space. We evaluate the resulting binary image descriptors on two challenging applications, image matching and retrieval, and achieve state-of-the-art results.


          Modeling Multi-turn Conversation with Deep Utterance Aggregation. (arXiv:1806.09102v2 [cs.CL] UPDATED)

Authors: Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao, Gongshen Liu

Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation, where related work simply concatenates the conversation utterances, ignoring the interactions among previous utterances for context modeling. In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation. In detail, a self-matching attention is first introduced to route the vital information in each utterance. Then the model matches a response with each refined utterance and the final matching score is obtained after attentive turns aggregation. Experimental results show our model outperforms the state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.


          MetaAnchor: Learning to Detect Objects with Customized Anchors. (arXiv:1807.00980v2 [cs.CV] UPDATED)

Authors: Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun

We propose a novel and flexible anchor mechanism named MetaAnchor for object detection frameworks. Unlike many previous detectors, which model anchors in a predefined manner, in MetaAnchor anchor functions can be dynamically generated from arbitrary customized prior boxes. Taking advantage of weight prediction, MetaAnchor is able to work with most anchor-based object detection systems such as RetinaNet. Compared with the predefined anchor scheme, we empirically find that MetaAnchor is more robust to anchor settings and bounding box distributions; in addition, it also shows potential on transfer tasks. Our experiment on the COCO detection task shows that MetaAnchor consistently outperforms the counterparts in various scenarios.


          A Weakly Supervised Adaptive DenseNet for Classifying Thoracic Diseases and Identifying Abnormalities. (arXiv:1807.01257v2 [cs.CV] UPDATED)

Authors: Bo Zhou, Yuemeng Li, Jiangcong Wang

We present a weakly supervised deep learning model for classifying thoracic diseases and identifying abnormalities in chest radiography. In this work, instead of learning from medical imaging data with region-level annotations, our model was merely trained on imaging data with image-level labels to classify diseases, and is able to identify abnormal image regions simultaneously. Our model consists of a customized pooling structure and an adaptive DenseNet front-end, which can effectively recognize possible disease features for classification and localization tasks. Our method has been validated on the publicly available ChestX-ray14 dataset. Experimental results have demonstrated that our classification and localization prediction performance achieved significant improvement over the previous models on the ChestX-ray14 dataset. In summary, our network can produce accurate disease classification and localization, which can potentially support clinical decisions.


          Case for the double-blind peer review. (arXiv:1807.01408v3 [cs.DL] UPDATED)

Authors: Lucie Tvrznikova

Peer review is a process designed to produce a fair assessment of research quality before the publication of scholarly work in a journal. Demographics, nepotism, and seniority have all been shown to affect reviewer behavior, suggesting that the most common, single-blind review method (or the less common open review method) might be biased. A survey of current research indicates that double-blind review offers a solution to many biases stemming from an author's gender, seniority, or location without imposing any significant downsides.


          Private Data Objects: an Overview. (arXiv:1807.05686v2 [cs.CR] UPDATED)

Authors: Mic Bowman, Andrea Miele, Michael Steiner, Bruno Vavala

We present Private Data Objects (PDOs), a technology that enables mutually untrusted parties to run smart contracts over private data. PDOs result from the integration of a distributed ledger and Intel Software Guard Extensions (SGX). In particular, contracts run off-ledger in secure enclaves using Intel SGX, which preserves data confidentiality and execution integrity and enforces data access policies (as opposed to raw data access). A distributed ledger verifies and records transactions produced by PDOs, in order to provide a single authoritative instance of such objects. This allows contracting parties to retrieve and check data related to contract and enclave instances, as well as to serialize and commit contract state updates. The design and the development of PDOs is an ongoing research effort, and open source code is available and hosted by Hyperledger Labs [5, 7].


          ENG: End-to-end Neural Geometry for Robust Depth and Pose Estimation using CNNs. (arXiv:1807.05705v2 [cs.CV] UPDATED)

Authors: Thanuja Dharmasiri, Andrew Spek, Tom Drummond

Recovering structure and motion parameters given an image pair or a sequence of images is a well-studied problem in computer vision. This is often achieved by employing Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM) algorithms based on the real-time requirements. Recently, with the advent of Convolutional Neural Networks (CNNs), researchers have explored the possibility of using machine learning techniques to reconstruct the 3D structure of a scene and jointly predict the camera pose. In this work, we present a framework that achieves state-of-the-art performance on single image depth prediction for both indoor and outdoor scenes. The depth prediction system is then extended to predict optical flow and ultimately the camera pose and trained end-to-end. Our motion estimation framework outperforms the previous motion prediction systems and we also demonstrate that the state-of-the-art metric depths can be further improved using the knowledge of pose.


          The Online $k$-Taxi Problem. (arXiv:1807.06645v2 [cs.DS] UPDATED)

Authors: Christian Coester, Elias Koutsoupias

We consider the online $k$-taxi problem, a generalization of the $k$-server problem, in which $k$ taxis serve a sequence of requests in a metric space. A request consists of two points $s$ and $t$, representing a passenger that wants to be carried by a taxi from $s$ to $t$. The goal is to serve all requests while minimizing the total distance traveled by all taxis. The problem comes in two flavors, called the easy and the hard $k$-taxi problem: In the easy $k$-taxi problem, the cost is defined as the total distance traveled by the taxis; in the hard $k$-taxi problem, the cost is only the distance of empty runs.
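To make the two cost notions concrete, here is a minimal Python sketch. It assumes taxis on the real line and a given assignment of requests to taxis; the function name `taxi_costs` and the example numbers are illustrative and do not come from the paper.

```python
def taxi_costs(starts, requests, assignment):
    """Return (easy_cost, hard_cost) for a fixed schedule on the real line.

    starts:     initial taxi positions
    requests:   list of (s, t) pairs, a passenger travelling from s to t
    assignment: assignment[i] is the index of the taxi serving request i
    """
    positions = list(starts)
    empty = 0.0   # distance driven without a passenger (empty runs)
    loaded = 0.0  # distance driven while carrying a passenger
    for (s, t), taxi in zip(requests, assignment):
        empty += abs(positions[taxi] - s)  # drive empty to the pickup point
        loaded += abs(s - t)               # carry the passenger from s to t
        positions[taxi] = t                # the taxi ends at the drop-off
    # Easy cost counts all travel; hard cost counts only the empty runs.
    return empty + loaded, empty


easy, hard = taxi_costs([0, 10], [(2, 5), (9, 4)], [0, 1])
print(easy, hard)  # 11.0 3.0
```

The loaded distance is fixed by the request sequence, so the two costs differ only by a term that no algorithm can influence. Removing that term from the objective is what changes the competitive ratio.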

The hard $k$-taxi problem is substantially more difficult than the easy version with at least an exponential deterministic competitive ratio, $\Omega(2^k)$, admitting a reduction from the layered graph traversal problem. In contrast, the easy $k$-taxi problem has exactly the same competitive ratio as the $k$-server problem. We focus mainly on the hard version. For hierarchically separated trees (HSTs), we present a memoryless randomized algorithm with competitive ratio $2^k-1$ against adaptive online adversaries and provide two matching lower bounds: for arbitrary algorithms against adaptive adversaries and for memoryless algorithms against oblivious adversaries. Due to well-known HST embedding techniques, the algorithm implies a randomized $O(2^k\log n)$-competitive algorithm for arbitrary $n$-point metrics. This is the first competitive algorithm for the hard $k$-taxi problem for general finite metric spaces and general $k$. For the special case of $k=2$, we obtain a precise answer of $9$ for the competitive ratio in general metrics. With an algorithm based on growing, shrinking and shifting regions, we show that one can achieve a constant competitive ratio also for the hard $3$-taxi problem on the line (abstracting the scheduling of three elevators).


          Robot Motion Planning in Learned Latent Spaces. (arXiv:1807.10366v2 [cs.RO] UPDATED)

Authors: Brian Ichter, Marco Pavone

This paper presents Latent Sampling-based Motion Planning (L-SBMP), a methodology towards computing motion plans for complex robotic systems by learning a plannable latent representation. Recent works in control of robotic systems have effectively leveraged local, low-dimensional embeddings of high-dimensional dynamics. In this paper we combine these recent advances with techniques from sampling-based motion planning (SBMP) in order to design a methodology capable of planning for high-dimensional robotic systems beyond the reach of traditional approaches (e.g., humanoids, or even systems where planning occurs in the visual space). Specifically, the learned latent space is constructed through an autoencoding network, a dynamics network, and a collision checking network, which mirror the three main algorithmic primitives of SBMP, namely state sampling, local steering, and collision checking. Notably, these networks can be trained through only raw data of the system's states and actions along with a supervising collision checker. Building upon these networks, an RRT-based algorithm is used to plan motions directly in the latent space - we refer to this exploration algorithm as Learned Latent RRT (L2RRT). This algorithm globally explores the latent space and is capable of generalizing to new environments. The overall methodology is demonstrated on two planning problems, namely a visual planning problem, whereby planning happens in the visual (pixel) space, and a humanoid robot planning problem.


          Making Classifier Chains Resilient to Class Imbalance. (arXiv:1807.11393v4 [cs.LG] UPDATED)

Authors: Bin Liu, Grigorios Tsoumakas

Class imbalance is an intrinsic characteristic of multi-label data. Most of the labels in multi-label data sets are associated with a small number of training examples, much smaller compared to the size of the data set. Class imbalance poses a key challenge that plagues most multi-label learning methods. Ensemble of Classifier Chains (ECC), one of the most prominent multi-label learning methods, is no exception to this rule, as each of the binary models it builds is trained from all positive and negative examples of a label. To make ECC resilient to class imbalance, we first couple it with random undersampling. We then present two extensions of this basic approach, where we build a varying number of binary models per label and construct chains of different sizes, in order to improve the exploitation of majority examples with approximately the same computational budget. Experimental results on 16 multi-label datasets demonstrate the effectiveness of the proposed approaches in a variety of evaluation metrics.


          CT-Wasm: Type-driven Secure Cryptography for the Web Ecosystem. (arXiv:1808.01348v2 [cs.CR] UPDATED)

Authors: Conrad Watt, John Renner, Natalie Popescu, Sunjay Cauligi, Deian Stefan

A significant amount of both client and server-side cryptography is implemented in JavaScript. Despite widespread concerns about its security, no other language has been able to match the convenience that comes from its ubiquitous support on the "web ecosystem" - the wide variety of technologies that collectively underpins the modern World Wide Web. With the new introduction of the WebAssembly bytecode language (Wasm) into the web ecosystem, we have a unique opportunity to advance a principled alternative to existing JavaScript cryptography use cases which does not compromise this convenience.

We present Constant-Time WebAssembly (CT-Wasm), a type-driven strict extension to WebAssembly which facilitates the verifiably secure implementation of cryptographic algorithms. CT-Wasm's type system ensures that code written in CT-Wasm is both information flow secure and resistant to timing side channel attacks; like base Wasm, these guarantees are verifiable in linear time. Building on an existing Wasm mechanization, we mechanize the full CT-Wasm specification, prove soundness of the extended type system, implement a verified type checker, and give several proofs of the language's security properties.

We provide two implementations of CT-Wasm: an OCaml reference interpreter and a native implementation for Node.js and Chromium that extends Google's V8 engine. We also implement a CT-Wasm to Wasm rewrite tool that allows developers to reap the benefits of CT-Wasm's type system today, while developing cryptographic algorithms for base Wasm environments. We evaluate the language, our implementations, and supporting tools by porting several cryptographic primitives - Salsa20, SHA-256, and TEA - and the full TweetNaCl library. We find that CT-Wasm is fast, expressive, and generates code that we experimentally measure to be constant-time.


          On the Relation Between Mobile Encounters and Web Traffic Patterns: A Data-driven Study. (arXiv:1808.03842v3 [cs.NI] UPDATED)

Authors: Babak Alipour, Mimonah Al Qathrady, Ahmed Helmy

Mobility and network traffic have been traditionally studied separately. Their interaction is vital for generations of future mobile services and effective caching, but has not been studied in depth with real-world big data. In this paper, we characterize mobility encounters and study the correlation between encounters and web traffic profiles using large-scale datasets (30TB in size) of WiFi and NetFlow traces. The analysis quantifies these correlations for the first time, across spatio-temporal dimensions, for device types grouped into on-the-go Flutes and sit-to-use Cellos. The results consistently show a clear relation between mobility encounters and traffic across different buildings over multiple days, with encountered pairs showing higher traffic similarity than non-encountered pairs, and long encounters being associated with the highest similarity. We also investigate the feasibility of learning encounters through web traffic profiles, with implications for dissemination protocols, and contact tracing. This provides a compelling case to integrate both mobility and web traffic dimensions in future models, not only at an individual level, but also at pairwise and collective levels. We have released samples of code and data used in this study on GitHub, to support reproducibility and encourage further research (https://github.com/BabakAp/encounter-traffic).


          AMoDSim: An Efficient and Modular Simulation Framework for Autonomous Mobility on Demand. (arXiv:1808.04813v2 [cs.MA] UPDATED)

Authors: Andrea Di Maria, Andrea Araldo, Giovanni Morana, Antonella Di Stefano

Urban transportation of the next decade is expected to be disrupted by Autonomous Mobility on Demand (AMoD): AMoD providers will collect ride requests from users and will dispatch a fleet of autonomous vehicles to satisfy requests in the most efficient way. Unlike current ride sharing systems, in which driver behavior has a clear impact on the system, AMoD systems will be exclusively determined by the dispatching logic. As a consequence, recent interest in the Operations Research and Computer Science communities has focused on this control logic. The new propositions and methodologies are generally evaluated via simulation. Unfortunately, no simulation platform has emerged as a reference, with the consequence that each author uses her own custom-made simulator, applicable only in her specific study, with no aim of generalization and without public release. This slows down progress in the area, as researchers cannot build on each other's work and cannot share, reproduce and verify results. The goal of this paper is to present AMoDSim, an open-source simulation platform that aims to fill this gap and accelerate research in future ride sharing systems.


          Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks. (arXiv:1808.06719v2 [cs.SD] UPDATED)

Authors: Sercan O. Arik, Heewoo Jun, Gregory Diamos

We propose the multi-head convolutional neural network (MCNN) architecture for waveform synthesis from spectrograms. Nonlinear interpolation in MCNN is employed with transposed convolution layers in parallel heads. MCNN achieves more than an order of magnitude higher compute intensity than commonly-used iterative algorithms like Griffin-Lim, yielding efficient utilization for modern multi-core processors, and very fast (more than 300x real-time) waveform synthesis. For training of MCNN, we use a large-scale speech recognition dataset and losses defined on waveforms that are related to perceptual audio quality. We demonstrate that MCNN constitutes a very promising approach for high-quality speech synthesis, without any iterative algorithms or autoregression in computations.


          Relaxed Voronoi: a Simple Framework for Terminal-Clustering Problems. (arXiv:1809.00942v2 [cs.DS] UPDATED)

Authors: Arnold Filtser, Robert Krauthgamer, Ohad Trabelsi

We reprove three known algorithmic bounds for terminal-clustering problems, using a single framework that leads to simpler proofs. In this genre of problems, the input is a metric space $(X,d)$ (possibly arising from a graph) and a subset of terminals $K\subset X$, and the goal is to partition the points $X$ such that each part, called a cluster, contains exactly one terminal (possibly with connectivity requirements) so as to minimize some objective. The three bounds we reprove are for Steiner Point Removal on trees [Gupta, SODA 2001], for Metric $0$-Extension in bounded doubling dimension [Lee and Naor, unpublished 2003], and for Connected Metric $0$-Extension [Englert et al., SICOMP 2014].

A natural approach is to cluster each point with its closest terminal, which would partition $X$ into so-called Voronoi cells, but this approach can fail miserably due to its stringent cluster boundaries. A now-standard fix, which we call the Relaxed-Voronoi framework, is to use enlarged Voronoi cells, but to obtain disjoint clusters, the cells are computed greedily according to some order. This method, first proposed by Calinescu, Karloff and Rabani [SICOMP 2004], was employed successfully to provide state-of-the-art results for terminal-clustering problems on general metrics. However, for restricted families of metrics, e.g., trees and doubling metrics, only more complicated, ad-hoc algorithms are known. Our main contribution is to demonstrate that the Relaxed-Voronoi algorithm is applicable to restricted metrics, and actually leads to relatively simple algorithms and analyses.
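The greedy construction of enlarged Voronoi cells can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the enlargement rule (a point may join terminal `t` if its distance to `t` is at most `alpha` times its distance to the nearest terminal) and the names `relaxed_voronoi`, `alpha` and `order` are assumptions made here, and the random choice of `alpha` and `order` used in randomized variants is omitted.

```python
def relaxed_voronoi(points, terminals, dist, alpha, order):
    """Greedily carve out enlarged Voronoi cells.

    Each point joins the first terminal in `order` whose enlarged cell
    contains it. With alpha >= 1 the nearest terminal always qualifies,
    so every point ends up in exactly one cluster.
    """
    clusters = {t: [] for t in terminals}
    for x in points:
        nearest = min(dist(x, t) for t in terminals)
        for t in order:
            if dist(x, t) <= alpha * nearest:  # x lies in t's enlarged cell
                clusters[t].append(x)
                break
    return clusters


# Hypothetical example: integer points on a line with terminals at 0 and 10.
line_dist = lambda a, b: abs(a - b)
clusters = relaxed_voronoi(range(11), [0, 10], line_dist, alpha=1.5, order=[10, 0])
print(clusters)  # {0: [0, 1, 2, 3], 10: [4, 5, 6, 7, 8, 9, 10]}
```

In this example the terminal processed first claims the point 4, which a strict Voronoi partition would give to terminal 0. Each terminal still lands in its own cluster, because its distance to the nearest terminal is zero and no other terminal can satisfy the test.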


          Extending set functors to generalised metric spaces. (arXiv:1809.02229v2 [math.CT] UPDATED)

Authors: Adriana Balan, Alexander Kurz, Jiří Velebil

For a commutative quantale $\mathcal{V}$, the category $\mathcal{V}-cat$ can be perceived as a category of generalised metric spaces and non-expanding maps. We show that any type constructor $T$ (formalised as an endofunctor on sets) can be extended in a canonical way to a type constructor $T_{\mathcal{V}}$ on $\mathcal{V}-cat$. The proof yields methods of explicitly calculating the extension in concrete examples, which cover well-known notions such as the Pompeiu-Hausdorff metric as well as new ones.

Conceptually, this allows us to solve the same recursive domain equation $X\cong TX$ in different categories (such as sets and metric spaces) and we study how their solutions (that is, the final coalgebras) are related via change of base.

Mathematically, the heart of the matter is to show that, for any commutative quantale $\mathcal{V}$, the `discrete' functor $D:\mathsf{Set}\to \mathcal{V}-cat$ from sets to categories enriched over $\mathcal{V}$ is $\mathcal{V}-cat$-dense and has a density presentation that allows us to compute left-Kan extensions along $D$.


          Data Augmentation for Spoken Language Understanding via Joint Variational Generation. (arXiv:1809.02305v2 [cs.CL] UPDATED)

Authors: Kang Min Yoo, Youhyun Shin, Sang-goo Lee

Data scarcity is one of the main obstacles to domain adaptation in spoken language understanding (SLU) due to the high cost of creating manually tagged SLU datasets. Recent works in neural text generative models, particularly latent variable models such as the variational autoencoder (VAE), have shown promising results with regard to generating plausible and natural sentences. In this paper, we propose a novel generative architecture which leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only helps alleviate the data scarcity issue in the SLU task for many datasets but also indiscriminately improves language understanding performance for various SLU models, supported by extensive experiments and rigorous statistical testing.


          Design Automation for Adiabatic Circuits. (arXiv:1809.02421v2 [cs.ET] UPDATED)

Authors: Alwin Zulehner, Michael P. Frank, Robert Wille

Adiabatic circuits are heavily investigated since they allow for computations with asymptotically close to zero energy dissipation per operation - serving as an alternative technology for many scenarios where energy efficiency is preferred over fast execution. Their concepts are motivated by the fact that the information lost from conventional circuits results in an entropy increase which causes energy dissipation. To overcome this issue, computations are performed in a (conditionally) reversible fashion and, additionally, have to satisfy switching rules that are different from conventional circuitry - crying out for dedicated design automation solutions. While previous approaches either focus on their electrical realization (resulting in small, hand-crafted circuits only) or on designing fully reversible building blocks (an unnecessary overhead), this work aims to provide an automatic and dedicated design scheme that explicitly takes the recent findings in this domain into account. To this end, we review the theoretical and technical background of adiabatic circuits and present automated, dedicated methods that realize the desired function as an adiabatic circuit. The resulting methods are further optimized - leading to efficient design automation for this promising technology. Evaluations confirm the benefits and applicability of the proposed solution.


          The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure. (arXiv:1809.03063v2 [cs.LG] UPDATED)

Authors: Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody

Many modern machine learning classifiers are shown to be vulnerable to adversarial perturbations of the instances. Despite a massive amount of work focusing on making classifiers robust, the task seems quite challenging. In this work, through a theoretical study, we investigate the adversarial risk and robustness of classifiers and draw a connection to the well-known phenomenon of concentration of measure in metric measure spaces. We show that if the metric probability space of the test instance is concentrated, any classifier with some initial constant error is inherently vulnerable to adversarial perturbations.

One class of concentrated metric probability spaces is the so-called Levy families, which include many natural distributions. In this special case, our attacks only need to perturb the test instance by at most $O(\sqrt n)$ to make it misclassified, where $n$ is the data dimension. Using our general result about Levy instance spaces, we first recover as a special case some of the previously proved results about the existence of adversarial examples. However, many more Levy families are known (e.g., product distributions under the Hamming distance) for which we immediately obtain new attacks that find adversarial examples of distance $O(\sqrt n)$.

Finally, we show that concentration of measure for product spaces implies the existence of forms of "poisoning" attacks in which the adversary tampers with the training data with the goal of degrading the classifier. In particular, we show that for any learning algorithm that uses $m$ training examples, there is an adversary who can increase the probability of any "bad property" (e.g., failing on a particular test instance) that initially happens with non-negligible probability to $\approx 1$ by substituting only $\tilde{O}(\sqrt m)$ of the examples with other (still correctly labeled) examples.


          Synthetic Occlusion Augmentation with Volumetric Heatmaps for the 2018 ECCV PoseTrack Challenge on 3D Human Pose Estimation. (arXiv:1809.04987v3 [cs.CV] UPDATED)

Authors: István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe

In this paper we present our winning entry at the 2018 ECCV PoseTrack Challenge on 3D human pose estimation. Using a fully-convolutional backbone architecture, we obtain volumetric heatmaps per body joint, which we convert to coordinates using soft-argmax. Absolute person center depth is estimated by a 1D heatmap prediction head. The coordinates are back-projected to 3D camera space, where we minimize the L1 loss. Key to our good results is the training data augmentation with randomly placed occluders from the Pascal VOC dataset. In addition to reaching first place in the Challenge, our method also surpasses the state-of-the-art on the full Human3.6M benchmark among methods that use no additional pose datasets in training. Code for applying synthetic occlusions is available at https://github.com/isarandi/synthetic-occlusion.
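The soft-argmax step mentioned above can be illustrated with a minimal pure-Python sketch on a 1D heatmap. The function name `soft_argmax_1d` and the `temperature` parameter are assumptions for illustration; the paper applies the idea to volumetric (3D) heatmaps, which this sketch does not reproduce.

```python
import math


def soft_argmax_1d(scores, temperature=1.0):
    """Differentiable stand-in for argmax.

    Returns the expected index under softmax(scores / temperature),
    so the result is a continuous coordinate rather than an integer bin.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((s - m) / temperature) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


print(soft_argmax_1d([1.0, 1.0]))                 # 0.5, halfway between two equal bins
print(soft_argmax_1d([0.0, 0.0, 10.0, 0.0, 0.0])) # close to 2.0, the peaked bin
```

Because the output is a weighted average of indices, it can fall between bins, which is what allows sub-bin coordinate estimates and gradient-based training through the coordinate.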


          The Reachability Problem for Petri Nets is Not Elementary (Extended Abstract). (arXiv:1809.07115v2 [cs.FL] UPDATED)

Authors: Wojciech Czerwinski, Slawomir Lasota, Ranko Lazic, Jerome Leroux, Filip Mazowiecki

Petri nets, also known as vector addition systems, are a long established and widely used model of concurrent processes. The complexity of their reachability problem is one of the most prominent open questions in the theory of verification. That the reachability problem is decidable was established by Mayr in his seminal STOC 1981 work, and the currently best upper bound is non-primitive recursive cubic-Ackermannian of Leroux and Schmitz from LICS 2015. We show that the reachability problem is not elementary. Until this work, the best lower bound has been exponential space, due to Lipton in 1976.


          Classify, predict, detect, anticipate and synthesize: Hierarchical recurrent latent variable models for human activity modeling. (arXiv:1809.08875v2 [cs.CV] UPDATED)

Authors: Judith Bütepage, Hedvig Kjellström, Danica Kragic

Human activity modeling operates on two levels: high-level action modeling, such as classification, prediction, detection and anticipation, and low-level motion trajectory prediction and synthesis. In this work, we propose a semi-supervised generative latent variable model that addresses both of these levels by modeling continuous observations as well as semantic labels. We extend the model to capture the dependencies between different entities, such as a human and objects, and to represent hierarchical label structure, such as high-level actions and sub-activities. In the experiments we investigate our model's capability to classify, predict, detect and anticipate semantic action and affordance labels and to predict and generate human motion. We train our models on data extracted from depth image streams from the Cornell Activity 120, the UTKinect-Action3D and the Stony Brook University Kinect Interaction Dataset. We observe that our model performs well in all of the tasks and often outperforms task-specific models.


          Knowledge Graph Error detection and Completion. (arXiv:1809.09414v2 [cs.AI] UPDATED)

Authors: Shengbin Jia

In the era of big data, people face enormous challenges in acquiring information and knowledge. A knowledge graph (KG) lays the foundation for the knowledge-based organization and intelligent application in the Internet age with its powerful semantic processing capabilities and open organization capabilities. In recent years, the research and applications of large-scale knowledge graph libraries have attracted increasing attention in academic and industrial circles. The knowledge graph aims to describe the various entities or concepts and their relationships existing in the objective world, which constitutes a huge semantic network map. It usually stores knowledge in the form of triples (head entity, relationship, tail entity), which can be simplified to $(h, r, t)$.
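As a minimal sketch of the triple representation, the Python below stores $(h, r, t)$ triples and indexes them by head and relation. The helper names `build_index` and `tails` and the sample triples are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

# Illustrative (head entity, relationship, tail entity) triples.
triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Marie Curie", "field", "physics"),
    ("Marie Curie", "field", "chemistry"),
]


def build_index(triples):
    """Map each (h, r) pair to the set of tails t with (h, r, t) in the graph."""
    index = defaultdict(set)
    for h, r, t in triples:
        index[(h, r)].add(t)
    return index


def tails(index, h, r):
    """Return the sorted tails for (h, r), or an empty list if none are stored."""
    return sorted(index.get((h, r), set()))


index = build_index(triples)
print(tails(index, "Marie Curie", "field"))  # ['chemistry', 'physics']
```

In this representation, completion amounts to proposing tails that are missing for a given $(h, r)$ pair, and error detection to flagging stored triples that are implausible.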


          A Roadmap Towards Resilient Internet of Things for Cyber-Physical Systems. (arXiv:1810.06870v2 [cs.ET] UPDATED)

Authors: Denise Ratasich, Faiq Khalid, Florian Geissler, Radu Grosu, Muhammad Shafique, Ezio Bartocci

The Internet of Things (IoT) is a ubiquitous system connecting many different devices - the things - which can be accessed from a distance. Cyber-physical systems (CPS) monitor and control the things from a distance. As a result, the concepts of dependability and security become deeply intertwined. The increasing level of dynamicity, heterogeneity, and complexity adds to the system's vulnerability, and challenges its ability to react to faults. This paper summarizes the state of the art in existing work on anomaly detection, fault-tolerance and self-healing, and adds a number of other methods applicable to achieving resilience in an IoT. We particularly focus on non-intrusive methods ensuring data integrity in the network. Furthermore, this paper presents the main challenges in building a resilient IoT for CPS, which is crucial in the era of smart CPS with enhanced connectivity (an excellent example of such a system is connected autonomous vehicles). It further summarizes our solutions, work-in-progress and future work on this topic to enable "Trustworthy IoT for CPS". Finally, this framework is illustrated on a selected use case: a smart sensor infrastructure in the transport domain.


          On the Identifiability of the Influence Model for Stochastic Spatiotemporal Spread Processes. (arXiv:1810.11548v2 [cs.SY] UPDATED)

Authors: Chenyuan He, Yan Wan, Frank L. Lewis

The influence model is a discrete-time stochastic model that succinctly captures the interactions of a network of Markov chains. The model produces a reduced-order representation of the stochastic network, and can be used to describe and tractably analyze probabilistic spatiotemporal spread dynamics, and hence has found broad usage in network applications such as social networks, traffic management, and failure cascades in power systems. This paper provides sufficient and necessary conditions for the identifiability of the influence model, and also develops estimators for the model structure through exploiting the model's special properties. In addition, we analyze conditions for the identifiability of the partially observed influence model (POIM), for which not all of the sites can be measured.


          Learning Abstract Options. (arXiv:1810.11583v2 [cs.LG] UPDATED)

Authors: Matthew Riemer, Miao Liu, Gerald Tesauro

Building systems that autonomously create temporal abstractions from data is a key challenge in scaling learning and planning in reinforcement learning. One popular approach for addressing this challenge is the options framework (Sutton et al., 1999). However, only recently was a policy gradient theorem derived (Bacon et al., 2017) for online learning of general-purpose options in an end-to-end fashion. In this work, we extend previous work on this topic, which only focuses on learning a two-level hierarchy of options and primitive actions, to enable learning simultaneously at multiple resolutions in time. We achieve this by considering an arbitrarily deep hierarchy of options where high-level temporally extended options are composed of lower-level options with finer resolutions in time. We extend results from (Bacon et al., 2017) and derive policy gradient theorems for a deep hierarchy of options. Our proposed hierarchical option-critic architecture is capable of learning internal policies, termination conditions, and hierarchical compositions over options without the need for any intrinsic rewards or subgoals. Our empirical results in both discrete and continuous environments demonstrate the efficiency of our framework.


          A Cross-Modal Distillation Network for Person Re-identification in RGB-Depth. (arXiv:1810.11641v2 [cs.CV] UPDATED)

Authors: Frank Hafner, Amran Bhuiyan, Julian F. P. Kooij, Eric Granger

Person re-identification involves the recognition over time of individuals captured using multiple distributed sensors. With the advent of powerful deep learning methods able to learn discriminant representations for visual recognition, cross-modal person re-identification based on different sensor modalities has become viable in many challenging applications in, e.g., autonomous driving, robotics and video surveillance. Although some methods have been proposed for re-identification between infrared and RGB images, few address depth and RGB images. In addition to the challenges for each modality associated with occlusion, clutter, misalignment, and variations in pose and illumination, there is a considerable shift across modalities since data from RGB and depth images are heterogeneous. In this paper, a new cross-modal distillation network is proposed for robust person re-identification between RGB and depth sensors. Using a two-step optimization process, the proposed method transfers supervision between modalities such that similar structural features are extracted from both RGB and depth modalities, yielding a discriminative mapping to a common feature space. Our experiments investigate the influence of the dimensionality of the embedding space, compare transfer learning from depth to RGB and vice versa, and compare against other state-of-the-art cross-modal re-identification methods. Results obtained with the BIWI and RobotPKU datasets indicate that the proposed method can successfully transfer descriptive structural features from the depth modality to the RGB modality. It can significantly outperform state-of-the-art conventional methods and deep neural networks for cross-modal sensing between RGB and depth, with no impact on computational complexity.


          Emotional End-to-End Neural Speech Synthesizer. (arXiv:1711.05447v2 [cs.SD] CROSS LISTED)

Authors: Younggun Lee, Azam Rabiee, Soo-Young Lee

In this paper, we introduce an emotional speech synthesizer based on the recent end-to-end neural model named Tacotron. Despite its benefits, we found that the original Tacotron suffers from the exposure bias problem and from irregularity of the attention alignment. We address these problems by using the context vector and residual connections in the recurrent neural networks (RNNs). Our experiments showed that the model could be trained successfully and could generate speech for given emotion labels.


          Optimal Weighting for Exam Composition. (arXiv:1801.06043v1 [cs.CY] CROSS LISTED)

Authors: Sam Ganzfried, Farzana Yusuf

A problem faced by many instructors is that of designing exams that accurately assess the abilities of the students. Typically these exams are prepared several days in advance, and generic question scores are used based on rough approximation of the question difficulty and length. For example, for a recent class taught by the author, there were 30 multiple choice questions worth 3 points, 15 true/false with explanation questions worth 4 points, and 5 analytical exercises worth 10 points. We describe a novel framework where algorithms from machine learning are used to modify the exam question weights in order to optimize the exam scores, using the overall class grade as a proxy for a student's true ability. We show that significant error reduction can be obtained by our approach over standard weighting schemes, and we make several new observations regarding the properties of the "good" and "bad" exam questions that can have impact on the design of improved future evaluation methods.
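The generic weighting scheme quoted in the example can be worked through in a few lines of Python. This is only a sketch of the baseline arithmetic, with hypothetical names (`sections`, `max_score`, `section_shares`); it does not implement the learning-based reweighting that the paper proposes.

```python
# (number of questions, points per question) for each section of the example exam.
sections = {
    "multiple_choice": (30, 3),
    "true_false": (15, 4),
    "analytical": (5, 10),
}


def max_score(sections):
    """Total points available under the given per-question weights."""
    return sum(count * points for count, points in sections.values())


def section_shares(sections):
    """Fraction of the total exam score contributed by each section."""
    total = max_score(sections)
    return {name: count * points / total
            for name, (count, points) in sections.items()}


print(max_score(sections))       # 200
print(section_shares(sections))  # shares of 0.45, 0.3 and 0.25
```

Under these generic weights the 30 multiple choice questions carry 45% of the grade. Per-question weights like these are the quantities that a reweighting method would adjust.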


          Predicting Hurricane Trajectories using a Recurrent Neural Network. (arXiv:1802.02548v3 [cs.LG] CROSS LISTED)

Authors: Sheila Alemany, Jonathan Beltran, Adrian Perez, Sam Ganzfried

Hurricanes are cyclones that originate over tropical and subtropical waters and circulate about a defined center, with closed wind speeds exceeding 75 mph. At landfall, hurricanes can result in severe disasters. Accurate prediction of their trajectories is critical to reduce economic loss and save human lives. Given the complexity and nonlinearity of weather data, a recurrent neural network (RNN) could be beneficial in modeling hurricane behavior. We propose the application of a fully connected RNN to predict the trajectory of hurricanes. We employed the RNN over a fine grid to reduce typical truncation errors. We used the latitude, longitude, wind speed, and pressure of hurricanes, publicly provided by the National Hurricane Center (NHC), to predict the trajectory of a hurricane at 6-hour intervals. Results show that the proposed technique is competitive with methods currently employed by the NHC and can predict up to approximately 120 hours of hurricane path.


          DiCE: The Infinitely Differentiable Monte-Carlo Estimator. (arXiv:1802.05098v3 [cs.LG] CROSS LISTED)      Cache   Translate Page      

Authors: Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCG), e.g., in reinforcement learning and meta-learning. While deriving the first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DiCE both through a proof and numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://www.github.com/alshedivat/lola.
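
The abstract does not spell out the objective itself. As a sketch recalled from the DiCE paper rather than quoted from the text above, the construction rests on a "MagicBox" operator. The notation is an assumption of this note: $\mathcal{W}$ is a set of stochastic nodes, $\mathcal{C}$ the set of cost nodes, $\mathcal{W}_c$ the stochastic nodes that influence cost $c$, and $\perp$ a stop-gradient operator (it returns its argument but has zero derivative).

```latex
\begin{align}
  \Box(\mathcal{W}) &= \exp\bigl(\tau - \perp(\tau)\bigr),
    \qquad \tau = \sum_{w \in \mathcal{W}} \log p(w;\theta), \\
  \Box(\mathcal{W}) &\mapsto 1
    \quad \text{(when evaluated)}, \\
  \nabla_\theta \Box(\mathcal{W}) &= \Box(\mathcal{W})
    \sum_{w \in \mathcal{W}} \nabla_\theta \log p(w;\theta), \\
  \mathcal{L}_{\Box} &= \sum_{c \in \mathcal{C}} \Box(\mathcal{W}_c)\, c .
\end{align}
```

Because $\Box$ evaluates to 1 but reproduces the score-function factor each time it is differentiated, the same objective $\mathcal{L}_{\Box}$ can be differentiated repeatedly by an automatic differentiation tool, which is the property the abstract emphasises.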


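           'Oumuamua May Have Come from an Alien Civilization      Cache   Translate Page      
Last October, astronomers using the Pan-STARRS1 survey telescope in Hawaii discovered a strange object, 'Oumuamua. In Hawaiian, 'Oumuamua means "first messenger," and it is considered the first interstellar object known to humanity. Many interstellar objects have surely visited the solar system, but this is the first that humans have closely observed and confirmed. 'Oumuamua is cigar-shaped, markedly different from the objects we have observed in the solar system; it is accelerating away from the solar system and is not expected to ever return. But is it an asteroid or a comet? Or something else? Professor Avi Loeb, chair of Harvard's astronomy department, and postdoctoral researcher Shmuel Bialy have published a paper (preprint) in The Astrophysical Journal Letters arguing that 'Oumuamua is a spacecraft from an alien civilization. They believe 'Oumuamua is a light sail or solar sail that uses stellar radiation pressure for propulsion.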


          OBJECT THAT CROSSED THE SOLAR SYSTEM MAY BE AN ALIEN SPACECRAFT, SAYS HARVARD STUDY      Cache   Translate Page      

Astronomers raise the possibility that the rocky body nicknamed Oumuamua was sent by an alien civilization

A mysterious cigar-shaped rocky object that crossed our solar system last year may be an alien spacecraft, astronomers at Harvard University in the United States have suggested.

Named Oumuamua, which means “a messenger from afar arriving first” in Hawaiian, the object was the first to travel from another planetary system to ours. It was discovered by the Pan-STARRS 1 telescope in Hawaii in October 2017.

Since its passage, scientists have struggled to explain its unusual characteristics and its precise origin. Researchers initially classified it as a comet and then as an asteroid, before finally deeming it a new type of “interstellar object.”

Now a new study by researchers at the Harvard Smithsonian Center for Astrophysics, at Harvard University in the United States, raises the possibility that the object has an “artificial origin.”

“Oumuamua may be a fully operational probe sent intentionally to Earth vicinity by an alien civilization,” the astronomers wrote in the paper, which was submitted to the American scientific journal Astrophysical Journal Letters.

By the editorial staff
 

          Where on Earth Can You Put a Giant Telescope?      Cache   Translate Page      

In 1963, the astronomer Gerard Kuiper hired a plane and flew above the clouds, to circle the summit of Mauna Kea, in Hawaii. He needed a mountain, and the first one he had seen here, Haleakala, disappointed—too much fog. But Mauna Kea, the tallest mountain in the Pacific, stretched even closer toward space. The air around its cinder cone is dry and chill, the weather calm and constant. Kuiper convinced the governor of the state to help plow a rough road to the summit and then spent months collecting data about the quality of the light that shines there. In the end, he was convinced that Mauna Kea was “probably the best site in the world” for an astronomer, the perfect place to see “the moon, the planets, and the stars." As he said at the dedication of the site—"It is a jewel!" By the end of the 1970s, four sophisticated telescopes would perch on the summit.

There are now 13 telescopes on Mauna Kea, and the international consortium building a new behemoth instrument, the Thirty Meter Telescope (TMT), plans to add another. The TMT group was convinced, just as Kuiper had been, that this would be the best site in the world for their project. They knew that it has “great cultural and archaeological significance to the local people,” but they went for it anyway. Locals who wanted to protect the site’s heritage fought the project in court, but last week, after years of legal challenges and protests, the Hawaii Supreme Court approved the permit for the telescope’s construction.

When the TMT group set out to find a location for this unprecedented combination of optics and technology, it began by considering “all potentially interesting sites on Earth,” and ended up at perhaps the best known and most tested astronomical site on the planet. The same thing happened with another ambitious telescope project: The group building the Giant Magellan Telescope broke ground this summer at Las Campanas Observatory in Chile's Atacama Desert, another of the world’s premier astronomical sites. Of course, any billion-dollar project will want to choose the best location available. What is it about these select mountaintops that makes them so irresistible to astronomers?

It’s simple, in a way. Astronomers want to capture light, clean and clear, as it streams down to Earth from impossibly distant stars and planets and nebulae, and they want to do that as many nights as they can each year. There are some obvious factors that obstruct that goal. Light pollution from nearby human settlements makes it hard to see the faintest objects. Wind can rattle a telescope and affect its accuracy. Clouds get in the way, especially for telescopes that operate in the visible light spectrum. Site selection begins with the places on Earth with the greatest number of cloudless nights in a year. But even that is not enough.

"Once you have found a clear enough place, you have to find an area that has little turbulence," explains Marc Sarazin, an applied physicist for the European Southern Observatory, in charge of site monitoring. This thermal turbulence is formed when hot and cold air masses change altitudes. "This creates what you see on a very hot summer day on the asphalt, as you drive down the road," says Sarazin. "Everything is moving; you don't have a sharp view. It's the same for astronomers when they look upwards, if there are layers that have been disturbed. The stars will not be so sharp."

This quality—the sharpness of the stars—is what astronomers call "seeing," and it's one of the most important criteria in selecting a site. But there are other details to consider, as well. Air that’s full of water vapor can fog and frost up instruments and disrupt the view of light in the infrared spectrum. Radio waves and microwave radiation can mess with telescopes, and a place that heats up during the day and cools down at night can be a problem, too.

Some of these parameters change depending on what type of observations astronomers are looking to make. An infrared telescope project might trade more cloudy nights, which are less of a concern for that end of the light spectrum, for a site with less humidity. "It’s all a matter of compromises," says Sarazin.

On top of all that, it helps if the people running and using the telescopes can get there with relative ease, which means roads. Astronomers and support crews need to be able to work comfortably. Mauna Kea is so high that altitude sickness slowed down the construction crews that built the first observatory there in the 1960s.

When, during the same era, Horace Babcock was looking for an observatory site in the southern hemisphere, a place where the Carnegie Institution for Science could lay the foundations for its future ambitions, he worried about the availability of water at one promising location. Las Campanas, he told an interviewer later, “right from the start, had a lot of appeal”—clear nights, excellent seeing—“except it looked as if there might be little available water.”

As basic infrastructure was built at some of these remote sites, they became even more attractive, in part because existing infrastructure can bring down the cost of a project. In its report on potential sites, the TMT group noted that “as a developed site with several observatories, much of the infrastructure required for TMT exists on Mauna Kea.” Las Campanas turned out to have water. And it had room for plenty of telescopes.

These days, site testers like Sarazin have a wealth of data they can use to assess potential sites without trekking up endless mountaintops. As a general rule, a site tester might start with the highest mountains, with the fewest neighbors, and narrow down their choices based on further data collected at a short list of sites. And even in northern Chile, there are still hundreds of summits that could, in theory, provide good astronomical conditions.

Given all the factors and compromises, there are three main types of places in the world that are most suited for telescopes observing visible light. One is Antarctica, where the high peaks of arid plateaus have little turbulence and are surrounded by darkness. Conditions are brutal, but, advocates argue, sending a telescope to Antarctica is cheaper than sending one to space. The second is mountainous coastal areas, where the wind comes from the sea, minimizing turbulence whipping over the peaks. Chile fits this description. The third is an isolated mountain on an island, where all other conditions are right. Hawaii has mountains just like that. So do the Canary Islands. And that’s about it.

Astronomers do entertain the possibility that prime sites exist elsewhere. The TMT researchers noted that Uzbekistan has an excellent, unnamed site, and that northern Mexico and northern Africa have potential as well. There's some interest in Mount Kenya, and China may have any number of good sites. But the most obvious and most desirable places to put giant optical telescopes haven’t changed since the 1960s and 1970s.

"No one has ever done a comprehensive survey of the planet," says Sarazin. "We cannot say that we have looked everywhere. We know more or less the areas which could provide sites, but individual mountains have not been all characterized, of course. So there is still work for site testers."


          We Have Millions of "Artificial Societies" Heading Toward Social Conflict on a PC: Here Is What We Learn from Them       Cache   Translate Page      

The nature of the universe, the origin of life, the Millennium Problems, or how the brain works: contemporary science is full of fascinating unknowns, but if there is one we have serious trouble with, it is society. Social conflicts are a mess.

Not because we lack material to study. In fact, we have 7.5 billion people in the world and all of human history with which to analyze social conflicts. The problem is that we can observe, yes, but we have no way to experiment.

Not in the strict sense, at least. But for years scientists have used 'artificial societies' to simulate those conflicts under different variables millions of times. Now a new study on religious conflict gives us clues about how to think about contemporary societies.

Artificial societies heading toward conflict

Researchers from the universities of Oxford, Boston, and Agder (in Norway) have built an artificial society using current models from cognitive psychology; that is, they have developed artificial intelligence "agents" that try to imitate human behavior, taking into account things like religion or group identification.

They programmed that swarm of artificial agents with different ages and ethnicities and ran it millions of times to study social and religious conflict. The first result was surprising: overall, only 25% of the scenarios studied ended in violence.

Among the scenarios that did end in violence, the researchers found that it occurs more often when the groups are balanced in number and contact between their members is sustained. That regular contact "increases prolonged periods of anxiety," explained Ross Gore, a professor at the Virginia Modeling, Analysis & Simulation Center at Old Dominion University.

Ideas for understanding each other better

And anxiety, according to Gore, has the effect of increasing the ideological or religious cohesion of groups: they radicalize. That is the prelude to social conflict. This is why natural disasters, economic crises, and events that generate social unrest contribute to radicalization.

But beyond those exogenous factors, according to the data, exposure to different social groups is the key factor. These are simulations, of course, but they fit with studies indicating that exposure to ideas different from our own ends up radicalizing us.

"This is important because it means there may be policy implications for reducing the mutually escalating anxiety at the individual level," Gore explained. And although the results raise some very thorny questions (do we want a clustered, divided, pillarized society?), the technology gives us the means to advance toward what I mentioned at the start: a better understanding of social problems.

This article was originally published in Xataka by Javier Jiménez.


           Comment on Oumuamua by Caldwell Titcomb IV       Cache   Translate Page      
Paper of the Harvard guys: https://arxiv.org/pdf/1810.11490.pdf Page 4: "Since it is too late to image ‘Oumuamua with existing telescopes or chase it with chemical rockets (Seligman & Laughlin 2018), its likely origin and mechanical properties could only be deciphered by searching for other objects of its type in the future. "
          On similarities as a function of system size in heavy ion collisions      Cache   Translate Page      
arXiv:1811.02210

by: Petrovici, Mihai
Abstract:
Qualitative and quantitative similarities as a function of the system size in heavy ion collisions from low energy dissipative collisions, collective expansion of compressed baryonic matter up to the geometrical scaling evidenced at the highest energies presently attainable at LHC, are presented.
          Dominance of tensor correlations in high-momentum nucleon pairs studied by (p,pd) reaction      Cache   Translate Page      
arXiv:1811.02118

by: Terashima, S.
Abstract:
The isospin character of p-n pairs at large relative momentum has been observed for the first time in the 16O ground state. A strong population of the J,T=1,0 state and a very weak population of the J,T=0,1 state were observed in the neutron pick-up domain of 16O(p,pd) at 392 MeV. This strong isospin dependence at large momentum transfer is not reproduced by the distorted-wave impulse approximation calculations with known spectroscopic amplitudes. The results indicate the presence of high-momentum protons and neutrons induced by the tensor interactions in the ground state of 16O.
          Theory of two-pion photo- and electroproduction off the nucleon      Cache   Translate Page      
arXiv:1811.01475

by: Haberzettl, Helmut
Abstract:
A theory of two-pion photo- and electroproduction off the nucleon is derived considering all explicit three-body mechanisms of the interacting $\pi\pi N$ system. The full three-body dynamics of the interacting $\pi\pi N$ system is accounted for by the Faddeev-type ordering structure of the Alt-Grassberger-Sandhas equations. The formulation is valid for hadronic two-point and three-point functions dressed by arbitrary internal mechanisms provided all associated electromagnetic currents are constructed to satisfy their respective (generalized) Ward-Takahashi identities. It is shown that coupling the photon to the Faddeev structure of the underlying hadronic two-pion production mechanisms results in a natural expansion of the full two-pion photoproduction current $M_{\pi\pi}^\mu$ in terms of multiple dressed loops involving two-body subsystem scattering amplitudes of the $\pi\pi N$ system that preserves gauge invariance as a matter of course order by order in the number of (dressed) loops. A closed-form expression is presented for the entire gauge-invariant current $M_{\pi\pi}^\mu$ with complete three-body dynamics. Individually gauge-invariant truncations of the full dynamics most relevant for practical applications at the no-loop, one-loop, and two-loop levels are discussed in detail. An approximation scheme to the full two-pion amplitude for calculational purposes is also presented. It approximates, systematically, the full amplitude to any desired order of expansion in the underlying hadronic two-body amplitude. Moreover, it allows for the approximate incorporation of all neglected higher-order mechanisms in terms of a phenomenological remainder current. The effect and phenomenological usefulness of this remainder current is assessed in a tree-level calculation of the $\gamma N \to K K \Xi$ reaction.
          Light-cone PDFs from Lattice QCD      Cache   Translate Page      
arXiv:1811.01588
PoS LATTICE2018 (2018) 094

by: Alexandrou, Constantia
Abstract:
Using the approach proposed a few years ago by X. Ji, it has become feasible to extract parton distribution functions (PDFs) from lattice QCD, a task thought to be extremely difficult before Ji's proposal. In this talk, we discuss this approach, in particular different systematic effects that need to be controlled to ultimately have precise determinations of PDFs. Special attention is paid to the analysis of excited states. We emphasize that it is crucial to control excited states contamination and we show an analysis thereof for our lattice data, used to calculate quasi-PDFs and finally light-cone PDFs in the second part of this proceeding (C. Alexandrou et al., Quasi-PDFs from Twisted mass fermions at the physical point).
          Experimental investigation of the low molecular weight fluoropolymer for the ultracold neutrons storage      Cache   Translate Page      
arXiv:1811.01859

by: Düsing, C.
Abstract:
The experimental setup for examining the low-molecular-weight fluoropolymer CF$_{3}$(CF$_{2})_{3}$-O-CF$_{2}$-O-(CF$_{2})_{3}$CF$_{3}$, which is a promising coating material for the walls of storage chambers for ultracold neutrons, is described. The results are detailed. The measurement data are interpreted in the model of a multilayer complex quantum-mechanical potential of the chamber walls.
          Magic numbers for shape coexistence      Cache   Translate Page      
arXiv:1811.01071

by: Assimakis, I.E.
Abstract:
The increasing deformation in atomic nuclei leads to the change of the classical magic numbers (2,8,20,28,50,82..) which dictate the arrangement of nucleons in complete shells. The magic numbers of the three-dimensional harmonic oscillator (2,8,20,40,70...) emerge at deformations around epsilon=0.6. At lower deformations the two sets of magic numbers antagonize, leading to shape coexistence. A quantitative investigation is performed using the usual Nilsson model wave functions and the recently introduced proxy-SU(3) scheme.
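
The harmonic-oscillator magic numbers quoted above can be checked with a short calculation. As a sketch under the standard textbook assumption that oscillator shell N holds (N+1)(N+2) nucleons of one kind (the orbital degeneracy (N+1)(N+2)/2 times two spin states), the magic numbers are the running totals of filled shells. The function name is hypothetical.

```python
def harmonic_oscillator_magic_numbers(num_shells):
    """Cumulative occupancy of the first `num_shells` 3D oscillator shells."""
    magic_numbers = []
    total = 0
    for shell in range(num_shells):
        # (shell+1)(shell+2)/2 orbital states, times 2 for spin.
        total += (shell + 1) * (shell + 2)
        magic_numbers.append(total)
    return magic_numbers


print(harmonic_oscillator_magic_numbers(5))  # [2, 8, 20, 40, 70]
```

The first three values coincide with the classical magic numbers; the two sets diverge from the fourth shell onward (40 versus 28), which is where the spin-orbit interaction reshapes the classical sequence.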
          Improved study of the $\beta$-function of $SU(3)$ gauge theory with $N_f = 10 $ massless domain-wall/overlap fermions      Cache   Translate Page      
arXiv:1811.01729
NTUTH-18-505A

by: Chiu, Ting-Wai
Abstract:
I perform an improved study of the $\beta$-function of $ SU(3) $ lattice gauge theory with $N_f=10$ massless optimal domain-wall fermions in the fundamental representation, which serves as a check of the extent to which the scenario presented in Refs. \cite{Chiu:2016uui,Chiu:2017kza} is valid. In the finite-volume gradient flow scheme with $ c = \sqrt{8t}/L = 0.3 $ \cite{Fodor:2012td}, the renormalized couplings $g^2 (L,a) $ of four primary lattices ($ L/a = \{ 8, 10, 12, 16 \}$) are tuned (in $ 6/g_0^2 $) to the same $ g_c^2 $ with statistical error less than $0.5 \% $, in contrast to Refs. \cite{Chiu:2016uui,Chiu:2017kza} where $ g^2(L,a) $ are obtained by the cubic-spline interpolation. Then the renormalized couplings $ g^2(sL, a) $ of the scaled lattices ($ sL/a = \{16, 20, 24, 32\} $ with $s=2$) are computed at the same $ 6/g_0^2 $ of the corresponding primary lattices. Using the renormalized couplings of four lattice pairs $ (sL,L)/a = \{ (16,8), (20,10), (24,12), (32,16) \} $, the step-scaling $\beta$-function $ [g^2(sL,a) - g^2(L,a)]/\ln (s^2) $ is computed and extrapolated to the continuum limit $ \beta(s,g_c^2) $, as summarized in Table III. Based on the four data points of $ \beta(s,g_c^2) $ at $ g_c^2 = \{ 6.86(2), \ 6.92(3), \ 7.03(2), \ 7.16(2) \} $, I infer that the theory is infrared conformal or near-conformal.
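
The discrete step-scaling $\beta$-function quoted in the abstract is simple to evaluate for a single lattice pair. The sketch below only illustrates that formula: the two coupling values are hypothetical placeholders, not results from the paper, and the continuum extrapolation over the four lattice pairs is not attempted here.

```python
import math


def step_scaling_beta(g2_scaled, g2_primary, scale_factor):
    """Discrete beta function [g^2(sL, a) - g^2(L, a)] / ln(s^2)."""
    return (g2_scaled - g2_primary) / math.log(scale_factor ** 2)


# Hypothetical renormalized couplings for one lattice pair with s = 2.
g2_primary = 7.00  # placeholder for g^2(L, a)
g2_scaled = 7.05   # placeholder for g^2(sL, a)

beta = step_scaling_beta(g2_scaled, g2_primary, scale_factor=2)
print("step-scaling beta for the placeholder pair:", beta)
```

A value close to zero across a range of couplings is the kind of behaviour one would look for when arguing for infrared conformality, although the actual inference in the abstract rests on the continuum-extrapolated data.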
          Approximate sum rule for the electric dipole moment of light nuclei      Cache   Translate Page      
arXiv:1811.01841

by: Yamanaka, Nodoka
Abstract:
The measurement of the electric dipole moment (EDM) is an excellent test of the standard model of particle physics, and the detection of a finite value is a signal of a new source of CP violation beyond it. Among systems for which the EDM can be measured, light nuclei are particularly interesting due to their high sensitivity to new physics. In this proceedings contribution, we examine the sensitivity of the EDM of several light nuclei to the CP-odd one-pion-exchange nucleon-nucleon interaction within the cluster model. We suggest an approximate sum rule for the nuclear EDM.
          Localization in SU(3) gauge theory      Cache   Translate Page      
arXiv:1811.01887

by: Kovacs, Tamas G.
Abstract:
In this paper we study the localization transition of Dirac eigenmodes in quenched QCD. We determined the temperature dependence of the mobility edge in the quark-gluon plasma phase near the deconfining critical temperature. We calculated the critical temperature where all of the localized modes disappear from the spectrum and compared it with the critical temperature of the deconfining transition. We found that the localization transition happens at the same temperature as the deconfining transition which indicates a strong relation between the two phenomena.
          Transverse Momentum Resummation for $s$-channel single top quark production at the LHC      Cache   Translate Page      
arXiv:1811.01428
MSUHEP-18-020

by: Sun, Peng
Abstract:
We study the soft gluon radiation effects for the $s$-channel single top quark production at the LHC. By applying the transverse momentum dependent factorization formalism, the large logarithms about the small total transverse momentum ($q_\perp$) of the single-top plus one-jet final state system, are resummed to all orders in the expansion of the strong interaction coupling at the accuracy of Next-to-Leading Logarithm (NLL). We compare our numerical results with PYTHIA and find that both the $q_\perp$ and $\phi^*$ observables from PYTHIA are consistent with our prediction. Furthermore, we point out the soft gluon radiation effects from the final state become significant in this process, especially for the boosted kinematical region.
          Deflection angle of light for an observer and source at finite distance from a rotating global monopole      Cache   Translate Page      
arXiv:1811.01739

by: Ono, Toshiaki
Abstract:
By using a method improved with a generalized optical metric, the deflection of light for an observer and source at finite distance from a lens object in a stationary, axisymmetric and asymptotically flat spacetime has been recently discussed [Ono, Ishihara, Asada, Phys. Rev. D 96, 104037 (2017)]. In this paper, we study a possible extension of this method to an asymptotically nonflat spacetime. We discuss a rotating global monopole. Our result for the deflection angle of light is compared with a recent work on the same spacetime but limited to an asymptotic source and observer [Jusufi et al., Phys. Rev. D 95, 104012 (2017)], which employs another approach proposed by Werner, using Nazim's osculating Riemannian construction method via the Randers-Finsler metric. We show that the two different methods give the same result in the asymptotically far limit. We also obtain the corrections to the deflection angle due to the finite distance from the rotating global monopole.
          Criticality and extended phase space thermodynamics of AdS black holes in higher curvature massive gravity      Cache   Translate Page      
arXiv:1811.01018

by: Hendi, Seyed Hossein
Abstract:
Considering the de Rham-Gabadadze-Tolley theory of massive gravity coupled with (ghost-free) higher curvature terms arising from the Lovelock Lagrangian, we obtain charged AdS black hole solutions in diverse dimensions. We compute thermodynamic quantities in the extended phase space by considering the variations of the negative cosmological constant, Lovelock coefficients ($\alpha_{i}$) and massive couplings ($c_{i}$), and prove that such variations are necessary for satisfying the extended first law of thermodynamics as well as the associated Smarr formula. In addition, by performing a comprehensive thermal stability analysis for the topological black hole solutions, we show in what regions thermally stable phases exist. Calculations show the results are radically different from those in Einstein gravity. Furthermore, we investigate $P-V$ criticality of massive charged AdS black holes in higher dimensions, including the effect of higher curvature terms and the massive parameter, and find that the critical behavior and phase transition can happen for non-compact black holes as well as spherically symmetric ones. The phase structure and critical behavior of topological AdS black holes are drastically restricted by the geometry of the event horizon. In this regard, the universal ratio, i.e. $\frac{{{P_c}{v_c}}}{{{T_c}}}$, is a function of the event horizon topology. It is shown that the phase structure of AdS black holes with a non-compact (hyperbolic) horizon could give birth to three critical points, corresponding to a reverse van der Waals behavior for the phase transition accompanied by two distinct van der Waals phase transitions. For black holes with a spherical horizon, the van der Waals, reentrant and analogue of solid/liquid/gas phase transitions are observed.
          Thermodynamics of the FRW universe at the event horizon in Palatini f(R) gravity      Cache   Translate Page      
arXiv:1811.01529

by: Sefiedgar, A.S.
Abstract:
In an accelerated expanding universe, one can expect the existence of an event horizon. It may be interesting to study the thermodynamics of the Friedmann-Robertson-Walker (FRW) universe at the event horizon. Considering the usual Hawking temperature, the first law of thermodynamics does not hold on the event horizon. To satisfy the first law of thermodynamics, it is necessary to redefine the Hawking temperature. In this paper, using the redefinition of the Hawking temperature and applying the first law of thermodynamics on the event horizon, the Friedmann equations are obtained in f(R) gravity from the viewpoint of the Palatini formalism. In addition, the generalized second law (GSL) of thermodynamics, as a measure of the validity of the theory, is investigated.
          Analytical properties of the gluon propagator from truncated Dyson-Schwinger equation in complex Euclidean space      Cache   Translate Page      
arXiv:1811.01479

by: Kaptari, L.P.
Abstract:
We suggest a framework based on the rainbow approximation with effective parameters adjusted to lattice data. The analytic structure of the gluon and ghost propagators of QCD in Landau gauge is analyzed by means of numerical solutions of the coupled system of truncated Dyson-Schwinger equations. We find that the gluon and ghost dressing functions are singular in complex Euclidean space with singularities as isolated pairwise conjugated poles. These poles hamper solving numerically the Bethe-Salpeter equation for glueballs as bound states of two interacting dressed gluons. Nevertheless, we argue that, by knowing the position of the poles and their residues, a reliable algorithm for numerically solving the Bethe-Salpeter equation can be established.
          The strong coupling from $e^+e^-\to$ hadrons      Cache   Translate Page      
arXiv:1811.01829

by: Boito, Diogo
Abstract:
We use a new compilation of the hadronic $R$-ratio from available data for the process $e^+e^-\to$ hadrons below the charm mass to determine the strong coupling $\alpha_s$, using finite-energy sum rules. Quoting our results at the $\tau$ mass to facilitate comparison to the results obtained from similar analyses of hadronic $\tau$-decay data, we find $\alpha_s(m_\tau^2)=0.298\pm 0.016\pm 0.006$ in fixed-order perturbation theory, and $\alpha_s(m_\tau^2)=0.304\pm 0.018\pm 0.006$ in contour-improved perturbation theory, where the first error is statistical, and the second error combines various systematic effects. These values are in good agreement with a recent determination from the OPAL and ALEPH data for hadronic $\tau$ decays. We briefly compare the $R(s)$-based analysis with the $\tau$-based analysis.
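
For a quick comparison of the two quoted values, the statistical and systematic errors can be folded into a single number. The abstract does not combine them; the sketch below assumes the two error sources are uncorrelated and adds them in quadrature, which is a common convention rather than something stated by the authors. The function name is hypothetical.

```python
import math


def combine_in_quadrature(*errors):
    """Total uncertainty for uncorrelated error sources."""
    return math.sqrt(sum(e * e for e in errors))


# Central values and (statistical, systematic) errors quoted in the abstract.
alpha_fopt, fopt_errors = 0.298, (0.016, 0.006)
alpha_cipt, cipt_errors = 0.304, (0.018, 0.006)

fopt_total = combine_in_quadrature(*fopt_errors)
cipt_total = combine_in_quadrature(*cipt_errors)

print(f"FOPT: {alpha_fopt:.3f} +/- {fopt_total:.3f}")
print(f"CIPT: {alpha_cipt:.3f} +/- {cipt_total:.3f}")
print(f"difference between schemes: {alpha_cipt - alpha_fopt:.3f}")
```

Under this assumption the 0.006 difference between fixed-order and contour-improved results is well inside either combined uncertainty.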
          Emergent D-instanton as a source of Dark Energy      Cache   Translate Page      
arXiv:1811.01330

by: Singh, Deobrat
Abstract:
We revisit a non-perturbative formulation leading to a vacuum created gravitational pair of $(3\bar{3})$-brane by a Poincare dual higher form $U(1)$ gauge theory on a $D_4$-brane. In particular, the analysis has revealed a dynamical geometric torsion $H_3$ for an on-shell Neveu-Schwarz (NS) form on a fat 4-brane. We argue that a D-instanton can be a viable candidate to incorporate the quintessence correction hidden to an emergent (3 + 1)-dimensional brane universe. It is shown that a dynamical non-perturbative correction may be realized with an axionic scalar QFT on an emergent anti 3-brane within a gravitational pair. The theoretical tool provokes thought to believe for an extra instantaneous dimension transverse to our classical brane-universe in an emergent scenario. Interestingly a D-instanton correction, sourced by an axion on an anti 3-brane, may serve as a potential candidate to explain the accelerated rate of expansion of our 3-brane universe and may provide a clue to the origin of dark energy.
          Special Relativistic Magnetohydrodynamics with Gravitation      Cache   Translate Page      
arXiv:1811.01594

by: Noh, Hyerim
Abstract:
We present a fully nonlinear and exact perturbation formulation of Einstein's gravity with a general fluid and the ideal magnetohydrodynamics (MHD) without imposing the slicing (temporal gauge) condition. Using this formulation, we derive equations of special relativistic (SR) MHD in the presence of weak gravitation. The equations are consistently derived in the limit of weak gravity and action-at-a-distance in the maximal slicing. We show that in this approximation the relativistic nature of gravity does not affect the SR MHD dynamics, but SR effects manifest themselves in the metric, and thus gravitational lensing. Neglecting these SR effects may lead to an overestimation of lensing masses.
          We All Wish Oumuamua Were an Alien Spacecraft... But This Is the Reality      Cache   Translate Page      

At first, scientists classified it as a comet and then as an asteroid; now 'Oumuamua is back in the headlines after two Harvard astronomers, Abraham Loeb and Shmuel Bialy, explained in a paper uploaded to arXiv (a repository of preprints, studies not yet published in a scientific journal) that they "do not rule out" that the strange object could be, or be part of, an alien spacecraft.

tags: oumuamua, spacecraft, extraterrestrial, astronomy, interstellar, harvard, ufo
