A little bit of introduction this time: I use the Whatsapp browser interface, and I hate audio messages. Since one thing I really like about being a developer is the ability to solve stupid problems with stupid solutions, I finally decided to write a way to automatically transcribe those damn messages. I receive messages in Italian and I needed the transcription to be fairly accurate, so I went with the Google Cloud API for their Speech2Text service (although I hate GCP documentation), since:
- AWS and IBM services are in my experience less precise.
- Mozilla Deepspeech seems cool, but I could not find a model trained on Italian.
- No way I would do that on my own. This was made because I am too lazy to listen!
All of this can be found in my Github repo whatsapp-audio-transcribe.
I already knew of bookmarklets, but I had never written one. For this project I needed something to inject some JS code into my Whatsapp page, since Whatsapp does not offer an API for this. I needed to either write a browser extension or use a bookmarklet. Since the former would take longer to develop (and also had other problems that will become clear later), I decided to write a bookmarklet.
This was perfect for me, since the audio of audio messages is a blob that I need to fetch from a whatsapp.com endpoint, and since the window.fetch written in the bookmarklet originates from whatsapp.com itself, it worked flawlessly.
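For readers who have never written a bookmarklet either, here is a minimal sketch of how one can be built: the script is wrapped in an immediately invoked function and URL-encoded behind the javascript: scheme. The helper name toBookmarklet is hypothetical and is not taken from the repo.

```javascript
// Hypothetical helper: turn a snippet of JS source into a bookmarklet URL.
// The IIFE wrapper keeps the snippet's variables out of the page's globals.
function toBookmarklet(source) {
  const wrapped = `(function(){${source}})();`;
  // encodeURIComponent escapes spaces, braces, quotes and so on,
  // so the result can be pasted into a bookmark's URL field.
  return "javascript:" + encodeURIComponent(wrapped);
}

// Usage: the resulting string is what goes into the bookmark's address.
const bookmarkUrl = toBookmarklet("window.alert(document.title)");
console.log(bookmarkUrl);
```

Clicking such a bookmark runs the snippet inside the current page, which is why its requests originate from the page itself.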
Then the bookmarklet needed to send the audio data (converted to base64) to the Google Cloud API somehow, and show me the results.
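The fetch and base64 steps could look roughly like the sketch below. It is an assumption about the approach, not the code from the repo: the function names are made up, and the audio URL is assumed to be a blob: URL read from the page (for example from an audio element).

```javascript
// Convert raw bytes to a base64 string.
// btoa is available in browsers and in recent Node versions.
function bytesToBase64(bytes) {
  let binary = "";
  const chunkSize = 0x8000; // keep the argument list of fromCharCode small
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Fetch the audio (hypothetical: audioUrl comes from the page) and
// return its content as base64. fetchFn is injectable for testing.
async function audioUrlToBase64(audioUrl, fetchFn = fetch) {
  const response = await fetchFn(audioUrl);
  const buffer = await response.arrayBuffer();
  return bytesToBase64(new Uint8Array(buffer));
}
```

Because the code runs inside the Whatsapp page, the fetch is a same-origin request and needs no special permission.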
This is where the Content Security Policy (CSP) comes in: it is a security header that a server can return when responding with some HTML. I had never heard of it before, but I had to work around it to make this project work.
This header contains a policy that restricts the trusted origins for scripts, resources and such. A policy can be very simple, like default-src 'self' (the browser will only accept content from the same origin as the website), or really complex, like the Whatsapp one:
default-src 'self' data: blob: *;script-src *.facebook.com *.fbcdn.net *.facebook.net *.google-analytics.com *.virtualearth.net *.google.com 127.0.0.1:* *.spotilocal.com:* 'unsafe-inline' 'unsafe-eval' blob: data: 'self' https://ajax.googleapis.com https://api.search.live.net https://maps.googleapis.com https://www.youtube.com https://s.ytimg.com;style-src data: blob: 'unsafe-inline' * 'self' https://fonts.googleapis.com;connect-src *.facebook.com facebook.com *.fbcdn.net *.facebook.net *.spotilocal.com:* wss://*.facebook.com:* https://fb.scanandcleanlocal.com:* attachment.fbsbx.com ws://localhost:* blob: *.cdninstagram.com 'self' chrome-extension://boadgeojelhgndaghljhdicfkmllpafd chrome-extension://dliochdbjfkdbacpmhlcpmleaejidimm https://*.whatsapp.net https://www.facebook.com https://*.giphy.com https://*.tenor.co https://crashlogs.whatsapp.net/wa_clb_data https://crashlogs.whatsapp.net/wa_fls_upload_check https://www.bingapis.com/api/v6/images/search https://*.google-analytics.com wss://*.web.whatsapp.com wss://web.whatsapp.com https://dyn.web.whatsapp.com;font-src data: 'self' https://fonts.googleapis.com https://fonts.gstatic.com;img-src * data: blob:;media-src 'self' https://*.whatsapp.net https://*.giphy.com https://*.tenor.co https://*.cdninstagram.com https://*.streamable.com https://*.fbcdn.net blob: mediastream:;child-src 'self' blob:;frame-src https://www.youtube.com;
This one is harder to read, but as you can see it accepts content only from some specific domains. This is not a problem for my bookmarklet, since (as I wrote before) its origin is 'self', but I needed to send the audio data to Google in order to have it transcribed. Since the policy contains a connect-src directive, my script needed to comply with it in order to use any JS API that lets me send data outside the page.
Of course, the Google Cloud endpoint was not among the allowed origins, and I feared I would need to write something in my hosts file in order to intercept something sent to one of the allowed domains, but I was really lucky, since ws://localhost:* appears in the list. I think that Whatsapp devs simply forgot to exclude this one.
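Finding that entry by eye in such a long policy is tedious, so here is a small sketch that splits a policy string into directives and lists what connect-src allows. The function names are hypothetical, the sample policy is an abridged excerpt of the one above, and the check is a literal lookup, not a full implementation of CSP source matching.

```javascript
// Split a CSP string into { directiveName: [sources...] }.
function parseCsp(policy) {
  const directives = {};
  for (const part of policy.split(";")) {
    const tokens = part.trim().split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue; // skip empty segments (e.g. trailing ";")
    const [name, ...sources] = tokens;
    const key = name.toLowerCase();
    // If a directive is repeated, the first occurrence is the one that counts.
    if (!(key in directives)) directives[key] = sources;
  }
  return directives;
}

// connect-src falls back to default-src when it is not present.
function connectSources(policy) {
  const directives = parseCsp(policy);
  return directives["connect-src"] ?? directives["default-src"] ?? [];
}

// Abridged excerpt of the Whatsapp policy shown above.
const sample =
  "default-src 'self' data: blob: *;" +
  "connect-src *.facebook.com ws://localhost:* blob: 'self' https://*.whatsapp.net;";

console.log(connectSources(sample).includes("ws://localhost:*"));
```

The literal lookup is enough here because the goal is only to spot which entries are listed, not to decide whether an arbitrary URL would be allowed.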
I only needed to send the data via WebSocket to a local server, then to Google, then back to the page, and render the results somehow! I ended up using a simple window.alert since I'm too lazy to write something beautiful, but it works and I'm really happy!
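The page side of that round trip might look like the following sketch. Everything specific in it is an assumption: the port 8765, the function name, and the idea that the local server answers with the transcript as a plain text message. The local server itself, which forwards the audio to Google, is not shown.

```javascript
// Send base64 audio to a local WebSocket server and show the reply.
// url, createSocket and show are hypothetical options; the defaults
// only make sense inside a browser page.
function transcribeViaLocalServer(base64Audio, options = {}) {
  const {
    url = "ws://localhost:8765", // assumed port, allowed by ws://localhost:*
    createSocket = (u) => new WebSocket(u),
    show = (text) => window.alert(text),
  } = options;

  return new Promise((resolve, reject) => {
    const socket = createSocket(url);
    // Send the audio as soon as the connection is open.
    socket.onopen = () => socket.send(base64Audio);
    // Assumed protocol: the first message back is the transcript.
    socket.onmessage = (event) => {
      show(event.data);
      socket.close();
      resolve(event.data);
    };
    socket.onerror = () => reject(new Error("could not reach the local server"));
  });
}
```

Keeping the socket factory and the display function injectable makes it easy to swap window.alert for something prettier later.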
Alkaline environment accelerates Maillard reactions
This might look strange as it's not related to my dev job, but since I studied Biotechnology before becoming a developer, I still like to read about chemistry (which I really loved), and I also like eating and cooking.
For those who do not know, Maillard reactions are reactions that occur between (some) sugars and amino acids, and that generate a plethora of molecules, some of which are responsible for the peculiar taste and smell of browned food. They are of course really important in cooking, since some foods (steak, bread, onion) need those reactions to happen while cooking in order to taste better.
I'm currently reading a book on the chemistry behind meat cooking, and I found out that an alkaline environment favours these reactions, while an acidic one hinders them. That's really important to know when cooking: for example, a small dose of baking soda (an alkaline compound) helps onions turn into a brownish, sweet paste far more quickly. It also means I should never add acidic ingredients (lemon, tomato) to something that needs to undergo Maillard reactions, since they will slow them down.