Tuesday, August 20, 2019

What Alexa, Assistant and Siri are really listening to, storing and processing

Voice assistants are in vogue. Smart speakers from Amazon and Google have become one of the likely gifts this Christmas, and now that Google Assistant, Alexa and Siri all speak Spanish, it is natural to ask: what do they hear, and do they keep everything we say?

Privacy is once again being called into question by these products, which some people regard as "spies". As we will see below, however, all the companies that develop voice assistants have taken this issue into account and are quite careful about what data is processed and how it is handled.

Machines want to talk to us

Smart speakers are a little scary. It is not so much the anecdotal glitches, such as the ones that produced those sinister laughs, as the fact that it is not clear to users what kind of security governs interaction with these devices.

We are surrounded by technology that listens to us. Not content with carrying mobile devices everywhere, we now also have smart speakers at home, and they are not the only place where voice assistants live: smartwatches and various products that fall under the Internet of Things can also use them.

The problem (if we want to see it that way) is the feeling that these products are always listening, which would put our data security and our privacy at risk. That did happen, for example, with the private conversation that an Amazon Echo mistakenly sent to a random contact, but situations like that are the exception rather than the rule.

It may therefore seem that voice assistants are always watching us, although the manufacturers in fact make enough information available to understand what these assistants do with the data.

How a voice assistant works

Smart speakers and other products with voice assistant capabilities all work in a similar way: they are activated by the name that wakes them up. In other words, these assistants are in active standby: they are always listening, but they only pay attention from the moment they hear that special name (Hey Siri, OK Google, etc.), a short activation phrase.

When I asked Google Assistant on my Android phone whether it was spying on me (I phrased it as a question, and it replied as if to a query), this was its answer.

To implement this active standby, the assistants listen continuously and make short recordings of what they hear, trying to recognize the activation word. If the activation word or phrase is detected, the device keeps the recording in order to process it; if not, the recording is discarded.
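
The active standby behaviour described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the fragments are already text rather than raw audio, the request fits in the single fragment that follows the activation phrase, and the names `WAKE_WORDS` and `process_stream` are hypothetical, not taken from any vendor's software.

```python
from collections import deque

# Illustrative activation phrases; real devices use an on-device audio detector.
WAKE_WORDS = {"alexa", "ok google", "hey siri"}


def process_stream(snippets, buffer_size=3):
    """Return the fragments that would be kept for handling.

    `snippets` is an iterable of short, already-transcribed fragments
    (an assumption made to keep the sketch small).
    """
    buffer = deque(maxlen=buffer_size)  # older fragments fall off and are lost
    kept = []
    listening = False
    for snippet in snippets:
        if listening:
            kept.append(snippet)   # the request that follows the wake word
            listening = False
            continue
        buffer.append(snippet)
        if snippet.lower() in WAKE_WORDS:
            listening = True
            buffer.clear()         # nothing heard before activation is retained
    return kept
```

In this sketch, `process_stream(["what a day", "Alexa", "play music", "private chat"])` keeps only `"play music"`; everything spoken outside an activation is dropped when the small buffer rolls over.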

Once we activate the voice assistant, data transmission does begin, and here it is important to stress that we again depend on the cloud: the conversation and our questions or commands are not processed on the device itself but sent to a server that interprets them, processes them and returns an answer, which the assistant simply reads aloud (in a synthesized voice, of course).

So our voice is not stored locally on the device; it goes to the servers that the manufacturers of these devices and the developers of these assistants (Google, Apple, Amazon, Microsoft) have set up for all this heavy speech recognition work.

But what data is actually sent to those servers? What do these companies do with it? What can we do about it? We think these points are important to clarify, and we will do so separately for each of the four major voice assistants currently on the market.

Google Assistant

What does it collect, and for what purpose?

It is important to note that the voice assistant available, for example, on Google Home does not record all our conversations by default. Instead, the assistant "listens to short snippets of a few seconds to determine whether the activation phrase has been spoken. If not, those snippets are deleted, and that information does not leave your device until the activation phrase is heard."

The Google Home help pages explain what information the Google Home device collects. In fact, there is a specific section that lists the collected data, divided into three groups.

The first group contains data intended to improve the performance and reliability of the device, such as the stability of the Wi-Fi network, detection success rates or latency.

The second group comprises usage statistics, such as the number of interactions with the device and which buttons we press to invoke the assistant. The length of media sessions and the applications used in them are also collected, but here it is important to emphasize that, according to Google, "we do not collect information about the content being played, although the media service provider may tell us".

The third group of data collected is information about the hardware model and the software version in use, as well as about active processes, in order to determine the possible causes of failures and error messages.

The help pages also explain that the Google Assistant built into Google Home can access our search history "to offer you better and more helpful responses". We can give Google our address, but we do not have to; in that case the system "will estimate your approximate location based on your IP address and other signals in order to set alerts in the correct time zone and give you weather and traffic information."

The company says it collects data to make its services "faster, smarter, more relevant, and more useful to users", and activity with these assistants apparently allows Google to learn and, "over time, offer better and more personalized responses and suggestions".

Where is this data stored, and what control do we have over it?

The data transmitted to Google's servers is therefore kept in its data centers, where it is stored indefinitely unless we delete it manually.

This is where the tools that Google offers for controlling this activity and managing the data come in. In My Activity we have a full control panel where we can see the information Google holds about how we use its services, including, of course, the voice assistant.

In this panel we can find all the audio clips recorded from our requests to the assistant by filtering the results to show only "Voice and Audio". There we will find the recordings of our phrases, which we can delete along with any other information we do not want stored on those servers.
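
Conceptually, that filter-and-delete step looks like the sketch below. The list of dictionaries and the `product` field are a hypothetical data model invented for illustration; this is not Google's export format or any real API.

```python
def split_voice_entries(activity, product="Voice and Audio"):
    """Separate the entries of one product category from the rest.

    `activity` is a hypothetical list of dicts with a "product" key.
    """
    voice = [entry for entry in activity if entry["product"] == product]
    rest = [entry for entry in activity if entry["product"] != product]
    return voice, rest


# Hypothetical activity log
activity = [
    {"product": "Search", "text": "weather tomorrow"},
    {"product": "Voice and Audio", "text": "ok google, set a timer"},
    {"product": "Voice and Audio", "text": "ok google, play jazz"},
]

voice_clips, remaining = split_voice_entries(activity)
# "Deleting" the voice clips here simply means keeping only `remaining`.
```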

Amazon Alexa

What does it collect, and for what purpose?

Like other assistants, Alexa collects conversations, requests and voice commands. Amazon records and processes this information, which in some cases may also be shared with third parties.


This voice assistant actually starts recording "a fraction of a second of audio" before the activation word or phrase is spoken (or before we press the button that activates the assistant), and that is when it sends the recording to Amazon's servers.
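
One common way to capture a short stretch of audio from just before the activation is a small ring buffer. The sketch below is an assumption about how such a "pre-roll" could work, not a description of Amazon's implementation; the function name and the `preroll` and `request_len` parameters are hypothetical.

```python
from collections import deque


def capture_with_preroll(frames, wake_index, preroll=2, request_len=3):
    """Return the frames that would be sent once the wake word is detected.

    frames: list of audio frames (any objects).
    wake_index: position of the frame in which the wake word is detected.
    """
    ring = deque(maxlen=preroll)   # holds only the most recent frames
    for i, frame in enumerate(frames):
        if i == wake_index:
            before = list(ring)    # the short stretch just before the wake word
            return before + frames[i:i + 1 + request_len]
        ring.append(frame)
    return []                      # wake word never detected: nothing is sent
```

With ten numbered frames and the wake word detected in frame 5, the sketch returns frames 3 to 8: two frames of pre-roll, the wake word itself and the three frames that follow.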

Amazon says it keeps the recordings made by Alexa-enabled devices "to improve the accuracy of the results provided and to improve our services." As with other services, "deleting these recordings may degrade your experience".

Where is this data stored, and what control do we have over it?

Amazon has one of the largest server and data center infrastructures in the world: not for nothing is its Amazon Web Services division one of the keys to its business.


Any Alexa user can access these voice recordings from the Alexa app (in the privacy section) or from the Amazon website. From there it is possible to delete the recordings, and the same page also lets us review and play them back.

One feature of this control panel is that we can also manage the permissions we have granted to other services and applications built for Alexa. This is where the "Skills" come into play, the add-on capabilities that Amazon has long been offering to make this voice assistant more versatile.

In these preferences we can also place additional restrictions on how the assistant is used. Since the devices can mistake some of the words we say in conversation for the activation word and wake up regardless of context, we can force Alexa to activate only when we press the physical activation button.

We can also enable an audible signal that lets us know when a recording starts and ends, and even "switch off" the device, although obviously we then cannot take advantage of it.

Apple Siri

What does it collect, and for what purpose?

Siri was the first voice assistant to reach the mass market, in 2011, thanks to its integration into the iPhone. The assistant collects and uses information that we have on our phone, such as our name or our contacts.


If we also have location services enabled, that information can be sent along with the request we make to the assistant to make the answer more accurate.

Apple also points out that some Siri features require "real-time input from Apple servers", which would, for example, allow Siri to work with both our current location and our destination if we ask for a route between two points.

Where is this data stored, and what control do we have over it?

When we talk to Siri, those commands are sent to Apple's servers for analysis. In the process, Apple assigns the recording a random number, which is what our voice files are associated with.

Six months after the recording is made, or when we deactivate Siri, Apple "disassociates" that random number from the recording, which essentially removes the association that existed. From then on, the files are kept for another 18 months so that Apple can use them to test and improve its products.
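
The timeline just described (6 months linked to a random number, then another 18 months without it) can be worked through with a small sketch. It only models the periods stated in the text; the function names, the state labels and the day-clamping shortcut are assumptions made for illustration, and what happens after the 18 months is not specified in the source.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to 28 for simplicity."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def retention_state(recorded_on, today, siri_disabled_on=None):
    """Classify a recording according to the 6 + 18 month timeline."""
    unlink_on = add_months(recorded_on, 6)
    # Deactivating Siri earlier brings the disassociation forward.
    if siri_disabled_on is not None and siri_disabled_on < unlink_on:
        unlink_on = siri_disabled_on
    window_end = add_months(unlink_on, 18)
    if today < unlink_on:
        return "linked to random identifier"
    if today < window_end:
        return "kept without identifier"
    return "past the stated retention window"
```

For a recording made on January 15, 2019, the random number would be unlinked on July 15, 2019, and the additional 18 months would run until January 15, 2021.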

This handling of recordings also sets Apple apart in the control it gives the user. We can turn off location services, disable Siri, or switch off active standby so that the assistant only responds to a physical command.

However, we cannot access these recordings, as we can with Google Assistant or Amazon Echo. We can, indeed, delete the history of our voice interactions from all Apple servers, but to do so we have to deactivate voice dictation in our device settings.

It is possible, however, to ask Apple for all the data it holds about us, which is done from the Apple ID page. Once we sign in (we will be asked to answer two security questions), we can go to the "Manage Your Data and Your Privacy" section at the bottom of the page, which takes us to Apple's dedicated page for this purpose.

From there we can request a copy of our data, although it will not include the audio files we might hope to retrieve: the way those files are processed, described above, means the recordings cannot be recovered.

Microsoft Cortana

What does it collect, and for what purpose?

Microsoft's assistant began its journey on mobile devices, but the failure of Windows Mobile and Windows 10 on smartphones pushed Cortana to make the leap to desktops (it is integrated into Windows 10) and to some smart speakers.


When we use Cortana, information is collected about our device, our Microsoft services and the third-party services we connect to through Cortana. Microsoft says that "Cortana does not use the data you share with it to target ads to you".

As stated in the Cortana privacy policy, the data collected includes browsing history, calendar, contacts, location history and even, and this is somewhat worrying, "messages, apps, and communications content and history".

If we use Cortana in a Windows session signed in with our Microsoft account, these records are linked to that account, and from it we can control the data collected.

Where is this data stored, and what control do we have over it?

Microsoft also has a large server infrastructure (Azure is an increasingly important platform for the Redmond company), where it stores all the data collected.


If we want to access the data that Microsoft has stored about us through Cortana, we can do so from our Windows computer. In Settings we can go to "Cortana -> Permissions and History", which lets us (if we use Cortana regularly and are signed in, which is not my case) "change what Cortana knows about me in the cloud."

We can access all of this voice data held by Microsoft by going to the privacy dashboard of our account, which offers a range of options, among them Voice Search and Interaction.

Microsoft explains that "not all of your data is displayed" in the panel because the company "regularly removes data that is no longer needed for our systems". From this panel we can also control, for example, which third-party services we allow to connect to Cortana.

The control panel Microsoft provides for Cortana is more complete than the alternatives and, to a certain extent, lets us strike a balance between the voice assistant's features and the data it gathers as we use it.

Conclusion: assistants are listening, but the user has control (if they want it)

The avalanche of products and services built around voice assistants makes suspicions about the security and privacy these devices offer inevitable.


As already happens with many other services that collect data while we browse the internet, the aim of collecting more and more information about users has also reached voice assistants.

Here, as in many other scenarios involving the technology we use every day, it is important to find out what these services collect, but also to accept that enjoying their benefits involves some trade-offs. Each person decides whether the sacrifice is worth it and whether this data collection is a problem.

Be that as it may, we fortunately live at a time when technology companies have been forced to significantly improve the tools that give users access to the data collected, and assistants are no exception.

Apple does not offer such direct control, but its approach is valid, and it is precisely the other three companies, Google, Microsoft and Amazon, that are most likely to provoke the "data fears" that surround many of their products and services. They collect because a certain competitive advantage depends on that data (more effective advertising, for example), but that is nothing new, and these companies are avid collectors in this segment too.

The transparency of these services is increasing, although users should always keep some degree of oversight over this data collection. Let us not forget that this data, except in Apple's case, remains stored on these companies' servers. We can delete it whenever we want, but to do so we have to be proactive. That is a very different battle, but knowing how exposed we are is a good starting point.
