Creative agency Virtue introduces genderless voice Q to challenge biases in technology


Siri, Alexa, Google Assistant, Cortana and Bixby – almost all virtual assistants have something in common. Their default voices are women’s, and the role that plays in reinforcing gender stereotypes has long been documented, even inspiring the dystopian romance “Her.” Virtue, the creative agency owned by publisher Vice, wants to challenge the trend with a genderless voice called Q.

The project, done in collaboration with Copenhagen Pride, Equal AI, Koalition Interactive and thirtysoundsgood, wants technology companies to think outside the binary.

“Technology companies are continuing to gender their voice technology to fit scenarios in which they believe consumers will feel most comfortable adopting and using it,” says Q’s website. “A male voice is used in more authoritative roles, such as banking and insurance apps, and a female voice in more service-oriented roles, such as Alexa and Siri.”

To develop Q, Virtue worked with Anna Jørgensen, a linguist and researcher at the University of Copenhagen. They recorded the voices of five non-binary people, then used software to modulate the recordings to between 145 and 175 Hz, the range researchers define as gender neutral. The recordings were further refined after surveying 4,600 people and asking them to rate the voices on a scale from 1 (male) to 5 (female).

Virtue is encouraging people to share Q with Apple, Amazon and Microsoft, noting that even when different options are given for voice assistants, they are still usually categorized as male or female. As the project’s mission statement puts it, “as society continues to break down the gender binary, recognizing those who neither identify as male nor female, the technology we create should follow.”

from TechCrunch

NASA releases the final panorama that Opportunity took on Mars


Before a Martian dust storm took out Opportunity in June 2018, the rover was able to capture hundreds of images that NASA has now released as a panorama. The 360-degree photo is composed of 354 images overall, taken by the rover’s Panoramic Camera (Pancam) from May 13th through June 10th. It shows the vehicle’s final resting place in Perseverance Valley, located on Endeavour Crater’s western rim. The rover lost touch with NASA in June after it reported the approaching storm that ultimately covered its solar panels with dust.

Over the months that followed, the agency made more than a thousand attempts to contact the rover. On February 13th, 2019, it had no choice but to declare the mission over. This panorama combines photos taken through three filters that capture images in different wavelengths: near-infrared, green and violet. Some parts are still black and white because Pancam didn’t have time to photograph them through the green and violet filters before the dust storm hit.

Opportunity retired over a decade later than planned — the mission launched in 2004 and was only supposed to last for 90 days. This magnificent panorama befits its excellent run as a source of data about the red planet.

"The [below] image is a cropped version of the last 360-degree panorama taken by the Opportunity rover’s Panoramic Camera (Pancam) from May 13 through June 10, 2018. The view is presented in false color to make some differences between materials easier to see."


Source: NASA

from Engadget

The meaning of Nginx and F5


F5 Networks took a dive yesterday after the company announced it was acquiring NGINX, the company behind the popular open-source web server of the same name. While media coverage of the deal was largely positive, the public markets appeared much more skeptical, as F5 stock finished the day down nearly 8%.

As most analyses noted, the logic behind the deal is clear. F5’s existing markets have continued to dry up, with only low-single-digit growth expected year-over-year, so the company has to fork over some cash to modernize its offering and find new avenues for growth. NGINX, one of the most widely used web servers in the world, gives NetOps-focused F5 an entrée into the DevOps market. The acquisition also provides F5 with exposure to the growing movement toward open-source software.

Unfortunately for F5, the company’s stock is heavily owned by institutional holders and the tradeoffs and costs of the transaction hit areas where institutional investors are particularly sensitive.

You’re reading the Extra Crunch Daily. Like this newsletter? Subscribe for free to follow all of our discussions and debates.

First, investors love nothing more than when companies return cash to shareholders, primarily through buybacks or dividends. Unsurprisingly, investors were less than pleased when F5 announced it would be cutting its more than $1 billion share repurchase plan and would instead be using its cash to fund the NGINX deal.

Some investors and analysts were even more turned off by a purchase price they viewed as a bit pricey (with some estimating a mid-to-high 20s transaction multiple on 2018 sales), which meant F5 will have to extract larger financial benefits from the deal to reach attractive levels of return.

Though F5 expects the combined entity to gain market share and identify more than enough synergies to fuel returns in the long run, in the near term, the company announced that the deal would compress operating margins and earnings-per-share, while only modestly improving revenue growth. Diluting earnings-per-share in particular likely had a direct impact on the valuations spat out by investor models, since many at least in part use a multiple of year-ahead earnings to derive price targets.

And while many investor concerns seem largely technical or financial, several analysts expressed broader fears over the level of synergies and revenue growth F5 and NGINX will actually be able to generate. The synergy concerns from the investment community are actually fairly aligned with those expressed by some of the developer community.

Historically, open-source purists have typically viewed the involvement of for-profit entities as a fatal blow to open source platforms, based on the general assumption that financial and shareholder incentives will lead to proprietary licensing or other challenges. As a result, in addition to the normal integration risk seen in any M&A event, analysts expressed concerns over potential impacts to the combined entity’s ability to attract or retain dedicated open source customers or employees.

Fears that F5’s involvement in NGINX will deter developers seem a bit overblown, but the immediate harsh reaction of F5’s stock investors does nothing to quell fears that financial pressure may impact the existing NGINX model.

The divergent responses to F5’s deal seem indicative of a larger trend we’ve focused on, where long-standing tech powerhouses have seen growth stall after focusing on profitable business lines while ignoring emerging alternative models that have become the primary source of growth. Now, incumbents are having to cough up hefty sums just to play catch-up, and they face a tough balancing act between angering investors and investing in their future.

~ Written by Arman Tabatabai

Ingrid Burrington’s Networks of New York

Photo by Smith Collection/Gado/Getty Images

In book news, I finished Networks of New York, a slim book on the physical infrastructure behind New York City’s internet and surveillance systems. Burrington has made a name for herself covering the politics and challenges of the networking layers of the internet, and this is both a reference and a sort of travel field guide to technology that looms around us every day but we often overlook.

That all said, it is really slim, with a few details of mergers and acquisitions of telecom companies strewn in between pages of figures depicting manhole covers. As an exemplar of short books, I think it is an experimental contribution, but I can’t recommend the book if you really want to understand how internet plumbing works. A better book (albeit less about the internet) is Kate Ascher’s The Works: Anatomy of a City.

~ Written by Danny Crichton


To every member of Extra Crunch: thank you. You allow us to get off the ad-laden media churn conveyor belt and spend quality time on amazing ideas, people, and companies. If I can ever be of assistance, hit reply, or send an email to

This newsletter is written with the assistance of Arman Tabatabai from New York


from TechCrunch

Light waves allow scientists to 3D print with multiple materials


3D printing can already create sensors for NASA rovers, rocket engines, safer football helmets, even dentures. Name it, and it seems like it can be 3D printed. But the technology is still pretty limited, partly because most 3D printing systems can only make parts out of one material at a time. Now, researchers at the University of Wisconsin have discovered a way to use light to 3D print with more than one material.

Today, most 3D printers that lay down multiple materials use separate reservoirs. This new chemistry-based approach uses a single reservoir with two monomers (the molecules that are joined together to create a 3D-printed substance). Then, either ultraviolet or visible light is used to link those monomers together. Depending on which light is used, the final product will have different properties, like stiffness. The researchers hope this single-reservoir approach could be more practical than using multiple reservoirs of material.

This isn’t the first time researchers have used light to control 3D printing, and this concept still needs some fine-tuning. At the moment, the researchers have only managed to print hard materials next to soft ones. And it will take time before scientists know which monomers work together in a single reservoir, but they hope "wavelength-controlled, multi-material 3D printing" will make more complex objects possible.

Source: University of Wisconsin – Madison

from Engadget

How to Build Your Own Search Ranking Algorithm with Machine Learning by @CoperniX


“Any sufficiently advanced technology is indistinguishable from magic.” – Arthur C. Clarke (1961)

This quote couldn’t apply better to general search engines and web ranking algorithms.

Think about it.

You can ask Bing about almost anything and you’ll get the 10 best results out of billions of webpages within a couple of seconds. If that’s not magic, I don’t know what is!

Sometimes the query is about an obscure hobby. Sometimes it’s about a news event that nobody could have predicted yesterday.

Sometimes it’s even unclear what the query is about! It all doesn’t matter. When users enter a search query, they expect their 10 blue links on the other side.

To solve this hard problem in a scalable and systematic way, we made the decision very early in the history of Bing to treat web ranking as a machine learning problem.

As early as 2005, we used neural networks to power our search engine and you can still find rare pictures of Satya Nadella, VP of Search and Advertising at the time, showcasing our web ranking advances.

This article will break down the machine learning problem known as Learning to Rank. And if you want to have some fun, you could follow the same steps to build your own web ranking algorithm.

Why Machine Learning?

A standard definition of machine learning is the following:

“Machine learning is the science of getting computers to act without being explicitly programmed.”

At a high level, machine learning is good at identifying patterns in data and generalizing based on a (relatively) small set of examples.

For web ranking, it means building a model that will look at some ideal SERPs and learn which features are the most predictive of relevance.

This makes machine learning a scalable way to create a web ranking algorithm. You don’t need to hire experts in every single possible topic to carefully engineer your algorithm.

Instead, based on the patterns shared by a great football site and a great baseball site, the model will learn to identify great basketball sites or even great sites for a sport that doesn’t even exist yet!

Another advantage of treating web ranking as a machine learning problem is that you can use decades of research to systematically address the problem.

There are a few key steps that are essentially the same for every machine learning project. The diagram below highlights what these steps are in the context of search, and the rest of this article will cover them in more detail.

Web Ranking as a Machine Learning Problem

1. Define Your Algorithm Goal

Defining a proper measurable goal is key to the success of any project. In the world of machine learning, there is a saying that highlights very well the critical importance of defining the right metrics.

“You only improve what you measure.”

Sometimes the goal is straightforward: is it a hot dog or not?

Even without any guidelines, most people would agree, when presented with various pictures, whether they represent a hot dog or not.

And the answer to that question is binary. Either it is or it is not a hot dog.

Other times, things are quite a bit more subjective: is it the ideal SERP for a given query?

Everyone will have a different opinion of what makes a result relevant, authoritative, or contextual. Everyone will prioritize and weigh these aspects differently.

That’s where search quality rating guidelines come into play.

At Bing, our ideal SERP is the one that maximizes user satisfaction. The team has put a lot of thinking into what that means and what kind of results we need to show to make our users happy.

The outcome is the equivalent of a product specification for our ranking algorithm. That document outlines what’s a great (or poor) result for a query and tries to remove subjectivity from the equation.

An additional layer of complexity is that search quality is not binary. Sometimes you get perfect results, sometimes you get terrible results, but most often you get something in between.

In order to capture these subtleties, we ask judges to rate each result on a 5-point scale.

Finally, for a query and an ordered list of rated results, you can score your SERP using some classic information retrieval formulas.

Discounted cumulative gain (DCG) is a canonical metric that captures the intuition that the higher the result in the SERP, the more important it is to get it right.
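The position discount can be made concrete in a few lines of code. Below is a minimal sketch of DCG, using the common 2^rel − 1 gain and log2 position discount (the exact formula Bing uses isn’t specified in the article), on made-up ratings:

```python
import math

def dcg(ratings):
    """Discounted cumulative gain: higher positions weigh more.

    `ratings` is an ordered list of relevance labels for a SERP,
    e.g. from a 5-point scale. Uses the common 2^rel - 1 gain with
    a log2 position discount.
    """
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(ratings))

# A SERP with the best result on top scores higher than the
# same results in reverse order.
good = dcg([4, 3, 2, 1, 0])
bad = dcg(list(reversed([4, 3, 2, 1, 0])))
```

The discount term is what encodes the intuition above: misplacing a perfect result at the bottom of the page costs far more DCG than misplacing a mediocre one.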

2. Collect Some Data

Now we have an objective definition of quality, a scale to rate any given result, and by extension a metric to rate any given SERP. The next step is to collect some data to train our algorithm.

In other words, we’re going to gather a set of SERPs and ask human judges to rate results using the guidelines.

We want this set of SERPs to be representative of the things our broad user base is searching for. A simple way to do that is to sample some of the queries we’ve seen in the past on Bing.

While doing so, we need to make sure we don’t have some unwanted bias in the set.

For example, it could be that there are disproportionately more Bing users on the East Coast than other parts of the U.S.

If the search habits of users on the East Coast were any different from the Midwest or the West Coast, that’s a bias that would be captured in the ranking algorithm.

Once we have a good list of SERPs (both queries and URLs), we send that list to human judges, who rate them according to the guidelines.

Once done, we have a list of query/URL pairs along with their quality ratings. That set gets split into a “training set” and a “test set”, which are respectively used to:

  • Train the machine learning algorithm.
  • Evaluate how well it works on queries it hasn’t seen before (but for which we do have a quality rating that allows us to measure the algorithm performance).
Training & Test Set of Labeled Query/URL Pairs
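As a rough illustration of that split, here is a sketch on invented data (the queries, URLs, ratings, ratio, and seed are all made up for the example):

```python
import random

# Hypothetical labeled data: (query, url, rating on a 5-point scale).
labeled = [
    ("best pizza", f"https://example.com/{i}", random.randint(1, 5))
    for i in range(100)
]

# Shuffle, then hold out 20% as the test set. In practice you would
# split by whole query so no query straddles both sets, but this
# shows the basic idea.
random.seed(42)
random.shuffle(labeled)
cut = int(len(labeled) * 0.8)
training_set, test_set = labeled[:cut], labeled[cut:]
```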

3. Define Your Model Features

Search quality ratings are based on what humans see on the page.

Machines have an entirely different view of these web documents, which is based on crawling and indexing, as well as a lot of preprocessing.

That’s because machines reason with numbers, not directly with the text that is contained on the page (although it is, of course, a critical input).

The next step of building your algorithm is to transform documents into “features”.

In this context, a feature is a defining characteristic of the document, which can be used to predict how relevant it’s going to be for a given query.

Here are some examples.

  • A simple feature could be the number of words in the document.
  • A slightly more advanced feature could be the detected language of the document (with each language represented by a different number).
  • An even more complex feature would be some kind of document score based on the link graph. Obviously, that one would require a large amount of preprocessing!
  • You could even have synthetic features, such as the square of the document length multiplied by the log of the number of outlinks. The sky is the limit!
Preparing Web Data for Machine Learning
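The feature types listed above can be sketched in a few lines. Everything here is hypothetical (the language table, the link score, the synthetic combination); it only shows how raw document signals become a numeric vector:

```python
import math

# A hypothetical lookup table: machines reason over numbers, so each
# detected language is represented by a different number.
LANG_IDS = {"en": 0, "fr": 1, "de": 2}

def extract_features(doc):
    """Turn one document into a fixed-length feature vector."""
    # Simple feature: number of words in the document.
    n_words = len(doc["text"].split())
    # Slightly more advanced: detected language, as a number.
    lang_id = LANG_IDS.get(doc["lang"], -1)
    # Placeholder for an expensive link-graph score that a real
    # pipeline would precompute offline.
    link_score = doc.get("link_score", 0.0)
    # A synthetic feature combining two raw signals.
    synthetic = (n_words ** 2) * math.log(1 + doc.get("outlinks", 0))
    return [n_words, lang_id, link_score, synthetic]

doc = {"text": "a short example page", "lang": "en",
       "link_score": 0.7, "outlinks": 3}
features = extract_features(doc)
```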

It would be tempting to throw everything into the mix, but having too many features can significantly increase the time it takes to train the model and hurt its final performance.

Depending on the complexity of a given feature, it could also be costly to precompute reliably.

Some features will inevitably have a negligible weight in the final model, in the sense that they are not helping to predict quality one way or the other.

Some features may even have a negative weight, which means they are somewhat predictive of irrelevance!

As a side note, queries will also have their own features. Because we are trying to evaluate the quality of a search result for a given query, it is important that our algorithm learns from both.

4. Train Your Ranking Algorithm

This is where it all comes together. Each document in the index is represented by hundreds of features. We have a set of queries and URLs, along with their quality ratings.

The goal of the ranking algorithm is to maximize the rating of these SERPs using only the document (and query) features.

Intuitively we may want to build a model that predicts the rating of each query/URL pair, also known as a “pointwise” approach. It turns out it is a hard problem and it is not exactly what we want.

We don’t particularly care about the exact rating of each individual result. What we really care about is that the results are correctly ordered in descending order of rating.

A decent metric that captures this notion of correct order is the count of inversions in your ranking, the number of times a lower-rated result appears above a higher-rated one. The approach is known as “pairwise”, and we also call these inversions “pairwise errors”.

An Example of Pairwise Error

Not all pairwise errors are created equal. Because we use DCG as our scoring function, it is critical that the algorithm gets the top results right.

Therefore, a pairwise error at positions 1 and 2 is much more severe than an error at positions 9 and 10, all other things being equal. Our algorithm needs to factor this potential gain (or loss) in DCG for each of the result pairs.
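Both ideas — counting inversions and weighting them by their effect on DCG — fit in a short sketch. The gain and discount formulas below are the common DCG choices, not necessarily Bing’s exact ones:

```python
import math
from itertools import combinations

def pairwise_errors(ratings):
    """Count inversions: a lower-rated result ranked above a higher one."""
    return sum(1 for i, j in combinations(range(len(ratings)), 2)
               if ratings[i] < ratings[j])

def swap_dcg_gain(ratings, i, j):
    """How much DCG would improve by swapping positions i and j (0-based).

    The log2 discount makes swaps near the top matter more.
    """
    gain = lambda rel: 2 ** rel - 1
    disc = lambda pos: 1 / math.log2(pos + 2)
    return (gain(ratings[j]) - gain(ratings[i])) * (disc(i) - disc(j))

ranking = [1, 4, 3, 2]              # the 4 should have been first
errors = pairwise_errors(ranking)   # 1<4, 1<3, 1<2 -> 3 inversions
top_swap = swap_dcg_gain(ranking, 0, 1)        # fixing positions 1 and 2
low_swap = swap_dcg_gain([4, 3, 1, 2], 2, 3)   # fixing positions 3 and 4
```

The fix near the top of the SERP yields a much larger DCG gain than the fix near the bottom, which is exactly the weighting the algorithm needs to factor in.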

The “training” process of a machine learning model is generally iterative (and all automated). At each step, the model is tweaking the weight of each feature in the direction where it expects to decrease the error the most.

After each step, the algorithm remeasures the rating of all the SERPs (based on the known URL/query pair ratings) to evaluate how it’s doing. Rinse and repeat.
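As a toy illustration of that loop, here is a linear model whose weights get nudged downhill on squared rating error. A real ranker optimizes a pairwise, DCG-aware objective (and LambdaMART uses boosted trees, not a linear model); this only shows the tweak-remeasure-repeat rhythm, on made-up data:

```python
# Hypothetical training data: (feature vector, human rating).
examples = [
    ([1.0, 0.2], 4.0),
    ([0.5, 0.9], 2.0),
    ([0.1, 0.1], 0.0),
]

w = [0.0, 0.0]   # one weight per feature
lr = 0.1         # learning rate: how far to tweak at each step

for _ in range(500):
    for x, rating in examples:
        # Current prediction is the weighted sum of the features.
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - rating
        # Tweak each weight in the direction that reduces the error.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]

# After training, the model's score for the best example should be
# close to its human rating of 4.
pred_top = sum(wi * xi for wi, xi in zip(w, [1.0, 0.2]))
```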

Depending on how much data you’re using to train your model, it can take hours, maybe days to reach a satisfactory result. But ultimately it will still take less than a second for the model to return the 10 blue links it predicts are the best.

The specific algorithm we are using at Bing is called LambdaMART, a boosted decision tree ensemble. It is a successor of RankNet, the first neural network used by a general search engine to rank its results.

5. Evaluate How Well You Did

Now we have our ranking algorithm, ready to be tried and tested. Remember that we kept some labeled data that was not used to train the machine learning model.

The first thing we’re going to do is to measure the performance of our algorithm on that “test set”.

If we did a good job, the performance of our algorithm on the test set should be comparable to its performance on the training set. Sometimes it is not the case. The main risk is what we call “overfitting”, which means we over-optimized our model for the SERPs in the training set.

Let’s imagine an extreme scenario where the algorithm simply hardcoded the best results for each query. It would then perform perfectly on the training set, for which it knows what the best results are.

On the other hand, it would tank on the test set, for which it doesn’t have that information.

Now Here’s the Twist…

Even if our algorithm performs very well when measured by DCG, it is not enough.

Remember, our goal is to maximize user satisfaction. It all started with the guidelines, which capture what we think is satisfying users.

This is a bold assumption that we need to validate to close the loop.

To do that, we perform what we call online evaluation. When the ranking algorithm is running live, with real users, do we observe a search behavior that implies user satisfaction?

Even that is an ambiguous question.

If you type a query and leave after 5 seconds without clicking on a result, is that because you got your answer from captions or because you didn’t find anything good?

If you click on a result and come back to the SERP after 10 seconds, is it because the landing page was terrible or because it was so good that you got the information you wanted from it in a glance?

Ultimately, every ranking algorithm change is an experiment that allows us to learn more about our users, which gives us the opportunity to circle back and improve our vision for an ideal search engine.

Image Credits

In-post Images: Created by author, March 2019

from Search Engine Journal

The legendary and indescribable Dwarf Fortress goes non-ASCII and non-free for the first time


Among the growing field of indie games, one truly stands alone: Dwarf Fortress. The unbelievably rich and complex and legendarily user-unfriendly title has been a free staple of awe and frustration for years. But the developers, in a huge break from the status quo, have announced that the game will not only soon have a paid version on Steam — it’ll have… graphics.

It may be hard for anyone who isn’t already familiar with the game and community to understand how momentous this is. In the decade and a half this game has been in active, continuous development, perhaps the only thing that hasn’t changed about the game is that it is a maze for the eyes, a mess of alphanumerics and ASCII-based art approximating barrels, dwarves, goblins, and dozens of kinds of stone.

You know in The Matrix where they show how the world is made up of a bunch of essentially text characters? It’s basically that, except way more confusing. But you get a feel for it after a few years.

So when developers Tarn and Zach Adams announced on their Patreon account that they were planning on ditching the ASCII for actual sprites in a paid premium version of the game, to be made available on Steam and an indie marketplace, minds were blown. Of all the changes Dwarf Fortress has undergone, this is likely the most surprising. Here are a few screenshots compared with the old ASCII graphics:

Not that you couldn’t get graphics in other ways — gamers aren’t that masochistic. There are “tile packs” available in a variety of sizes and styles that any player can apply to the game to make it easier to follow; in fact, the creators of two popular tilesets, Meph and Mike Mayday, were tapped to help make the “official” one, which by the way looks nice. Kitfox Games (maker of the lovely Shrouded Isle) is helping out as well.

There are plenty of other little mods and improvements made by dedicated players. Many of those will likely be ported over to Steam Workshop and made a cinch to install — another bonus for paying players.

Now, I should note that I in no way find this bothersome. I support Tarn and Zach in whatever they choose to do, and at any rate the original ASCII version will always be free. But what does disturb me is the reason they are doing this. As Tarn wrote on Patreon in a rare non-game update:

We don’t talk about this much, but for many years, Zach has been on expensive medication, which has fortunately been covered by his healthcare. It’s a source of constant concern, as the plan has changed a few times and as the political environment has shifted. We have other family health risks, and as we get older, the precariousness of our situation increases; after Zach’s latest cancer scare, we determined that with my healthcare plan’s copay etc., I’d be wiped out if I had to undergo the same procedures. That said, crowdfunding is by far our main source of income and the reason we’re still here. Your support is still crucial, as the Steam release may or may not bring us the added stability we’re seeking now and it’s some months away.

It’s sad as hell to hear that a pair of developers whose game is as well-loved as this, and who are making a modest sum via Patreon, can still be frightened of sudden bankruptcy on account of a chronic medical condition.

This isn’t the place for a political debate, but one would hope that the creators of what amounts to a successful small business like this would not have to worry about such things in the richest country in the world.

That said, they seem comfortable with the move to real graphics and the addition of a more traditional income stream, so the community (myself included) will no doubt see the sunny side of this and continue to support the game in its new form.

from TechCrunch

Insta360 Evo captures 180- and 360-degree content for VR headsets


Following the powerful One X 360 camera (and some fun updates), Insta360 is now back with a rather eccentric device that aims to make better use of your VR headset. The new Insta360 Evo is a dual-mode camera that captures 360-degree content when folded into a cube, as well as 180-degree 3D content when unfolded (remember Lenovo’s VR camera?). What’s more, this convertible camera can stream freshly captured content directly to the likes of the HTC Vive Focus, Oculus Go and Samsung Gear VR, which will likely motivate users to create more VR content.

The Evo is essentially the One X morphed into a foldable form factor, yet at $419.99, it only costs about $20 more. You get the same 5.7K video resolution and 18-megapixel still resolution, along with handy features like FlowState stabilization, TimeShift hyperlapse plus WiFi hotspot connectivity for its dedicated iOS or Android app.

The most obvious difference here is the new 180-degree capture mode, which is automatically enabled when you unfold the Evo and lock its position using the top latch. Then it’s just a matter of toggling either photo or video capture before you start shooting, and you can manage all of these on the camera itself or via its companion app over a Wi-Fi connection.

According to CEO JK Liu, his team came up with this versatile device to offer an extra way of capturing one’s special moments, because depending on the content, sometimes 180-degree capture works better. I can imagine how this may benefit parents who want to remember what it feels like to gaze at their newborn baby, for instance. Liu added that the Evo is also suited to capturing sporting activities in first-person view, where the depth perception adds extra immersion.

Insta360 Evo

Like before, you can easily download or stream the camera’s content in the app, and then share it on Facebook (except for 3D photos at the moment), YouTube or Insta360’s own hosting service. But for those who want to get immersive right away, the company is now offering two more ways to enjoy your own VR content.

First of all, owners of the Oculus Go, Samsung Gear VR and HTC Vive Focus (support due later this month) will be able to connect their headsets directly to the Evo over WiFi, then they can transfer files over or even stream videos within the dedicated VR app, as opposed to having to manually copy (and maybe convert) files using a PC or an OTG accessory. I can imagine how this newfound convenience will make the Evo a useful learning tool for VR filmmakers as well as newbies.

Insta360 Evo

The second option is Holoframe, a clever glasses-free 3D solution using eye tracking, and it’s co-developed by Samsung spin-off, Mopic. This requires an optional $29.99 transparent smartphone case with a special filter on the back, and to use it, simply flip the case around so that it’s covering the screen, and then toggle Holoframe playback in the app. For now, this case is only available for the iPhone X, XS, XS Max plus XR, with Samsung’s Galaxy S8, S8+, S9, S9+ and Note 8 to be supported at a later date.

After an initial calibration, I was able to use Holoframe to view my 3D content on my iPhone XS Max, and the results were better than I expected — think of it as a much sharper version of what you might have seen on the HTC Evo 3D and the LG Optimus 3D from 2011. There’s still room for improvement, though, and Liu added that his team is already working on an update which will enable dynamic 3D effect — it’ll look as if you’re peering through a window, especially if you tilt the phone around.

There’s technically a third option, too, as the Insta360 Evo comes with a "3D Viewer," which is just a little foldable VR goggle attachment for any phone. This is obviously a quick and dirty way to play with the VR playback mode in the Evo’s app, but alternatively, you can also pop your phone into a Cardboard-like enclosure just like in the good old days.

Sadly, I don’t have an Oculus Go or a Gear VR headset to try the new Insta360 VR app, so I’ll have to wait for the app to hit the HTC Vive Focus later this month. For the rest of you, the Insta360 Evo is already available for purchase, and in addition to the 3D Viewer, the camera comes bundled with a mini-tripod grip (not pictured here) plus a protective pouch. Just make sure you also have a microSD card rated at UHS-I V30 speed, and you should be good to go.

Source: Insta360

from Engadget