The only problem with your analysis is that there is a lot of conjecture.
In my other life, I work in the SEO & Digital Marketing space and specialize in technical SEO. I also am an affiliate marketer. So let me give you a little perspective.
Your first point on keywords is bare, and there's a reason for it.
The Google system doesn't need to "figure out" anything. It already has the largest crowd-sourced "figure outers" in the world - namely its searchers. They are the ones who search for these word combinations.
Now, G is a meticulous tracker. It tracks everything. And thanks to its other "Free" products viz. Google Analytics and Chrome - it can continue tracking.
So among the thousands of metrics it tracks, let's analyze the base metrics first.
1. Volume - The number of times a particular keyword was searched.
2. CTR - Which links are clicked more.
3. Speed - Which site served the content the fastest, using the least amount of data.
4. Conversion - Which keywords are paying keywords and resulted in action on the end website (Forms filled, Add to Carts, Purchases)
5. Time Spent on Site, Pages Visited.
And finally, another metric it tracks is "Intent".
The most common intents are:
- Information
These are people looking for information on a subject
Example - "Symptoms of ED", "When to travel to Amsterdam?", "Best Porn Sites".
- Navigational
These are people who look for login pages of websites.
Example - "Sign up for ABN Credit Card", "US Visa Application Form" etc.
- Transactional
For affiliate marketers, e-commerce entities and SaaS companies - this is where the magic happens. These are people looking to buy stuff.
Example - "Cheapest Tube Script"
Beyond the three main intents above, there are more -
Specific Page Ones -
"German Wife Ravaged on Christmas Eve"
This is a tube site keyword, for a specific video that someone likes and wishes to access.
This could come under Navigational. But for the sake of clarity, let's keep it as an outlier.
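To make the intent idea concrete, here is a toy sketch of tagging a query with one of the three intents. The cue-word lists and the function name `guess_intent` are my own assumptions for illustration. This is a crude word-matching heuristic, certainly not how Google does it.

```python
# Toy intent tagger. The cue words are hypothetical, picked for illustration.
TRANSACTIONAL_CUES = {"buy", "cheap", "cheapest", "price", "discount", "order"}
NAVIGATIONAL_CUES = {"login", "sign", "signup", "application", "form", "portal"}
INFORMATIONAL_CUES = {"symptoms", "when", "how", "what", "why", "best"}


def guess_intent(query: str) -> str:
    """Return a rough intent label for a search query."""
    words = set(query.lower().replace("?", "").split())
    # Check buying cues first, then navigational, then informational.
    if words & TRANSACTIONAL_CUES:
        return "transactional"
    if words & NAVIGATIONAL_CUES:
        return "navigational"
    if words & INFORMATIONAL_CUES:
        return "informational"
    # Specific-page queries (like a video title) match no cue word.
    return "unknown"


for q in ("When to travel to Amsterdam?",
          "US Visa Application Form",
          "Cheapest Tube Script"):
    print(q, "->", guess_intent(q))
```

A specific-page query such as a video title falls through to "unknown" here, which is roughly why I'm treating it as an outlier.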
Google also harnesses NLP (Natural Language Processing) -
This is what tells it the difference between the use of the word "suspect" in
"The police found that the suspect had two penises" and "I suspect I may have two penises".
--------
Beyond this, Google has a product called "AdWords" (now Google Ads). The above metrics help Google decide the base price for a keyword (which is then inflated through a bidding war).
Google's tracking metrics go far beyond this, but for the sake of brevity, let's stop here.
Based on this, Google's paramount goal is to stay the "most relevant search engine".
A - It doesn't want to be gamed.
B - It doesn't want to serve low-quality pages. Because if people start finding crap on Google, they'll move to other search engines.
C - Because of its ulterior motive of taking over the world.
Since you mentioned "Big Data" - I'd like to point out that in Data Analytics - things happen through "Priority Buckets". A grouping mechanism of sorts.
What does that mean?
Google assimilates a set of keywords based on the above metrics and more. And then puts them in certain buckets.
This allows it to run controlled experiments on these particular buckets.
It also helps it to see the upheaval going on, place a cost on ranking, estimate traffic etc. for these buckets.
One keyword can be in several buckets.
So say -
Bucket A - High Traffic Keywords
Mesothelioma Lawyers
Viagra Cialis
Online Pharmacy No Prescription
Bucket B - Shady Websites
Viagra Cialis
Online Pharmacy No Prescription
Buy Fullz
(This is just an illustration. The buckets contain thousands, if not millions, of keywords.)
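If it helps to picture it, the two buckets above can be sketched as plain sets, with a reverse lookup showing which buckets a keyword sits in. The bucket names and the helper `buckets_by_keyword` are hypothetical. Nobody outside Google knows how this is actually stored.

```python
from collections import defaultdict

# Hypothetical bucket names, with the keywords from the example above.
BUCKETS = {
    "high_traffic": {
        "mesothelioma lawyers",
        "viagra cialis",
        "online pharmacy no prescription",
    },
    "shady_websites": {
        "viagra cialis",
        "online pharmacy no prescription",
        "buy fullz",
    },
}


def buckets_by_keyword(buckets):
    """Invert bucket -> keywords into keyword -> set of bucket names."""
    index = defaultdict(set)
    for name, keywords in buckets.items():
        for keyword in keywords:
            index[keyword].add(name)
    return dict(index)


INDEX = buckets_by_keyword(BUCKETS)
print(sorted(INDEX["viagra cialis"]))         # in both buckets
print(sorted(INDEX["mesothelioma lawyers"]))  # in one bucket only
```

The point of the reverse lookup is that an experiment run on one bucket can move a keyword that also lives in another.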
Now based on these, Google uses its proprietary algorithm, and sometimes human intervention, to rank websites. It sees what's going on, then readjusts the rankings, and keeps doing so until the rankings stick with the winners.
It is the most complicated A/B Split testing ever.
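As a very rough mental model of "readjust until the rankings stick", picture a loop that keeps swapping neighbouring results whenever the lower one shows better engagement, and stops once a full pass changes nothing. The function name `readjust_until_stable`, the page names, and the engagement numbers are all made up for illustration. The real system is proprietary and far more involved.

```python
def readjust_until_stable(ranking, engagement, max_rounds=100):
    """Swap adjacent pages while the lower one has better engagement.

    Returns the final ranking and the number of rounds that changed it.
    """
    ranking = list(ranking)  # work on a copy
    rounds = 0
    for _ in range(max_rounds):
        swapped = False
        for i in range(len(ranking) - 1):
            upper, lower = ranking[i], ranking[i + 1]
            if engagement[lower] > engagement[upper]:
                ranking[i], ranking[i + 1] = lower, upper
                swapped = True
        if not swapped:
            break  # nothing moved, so the rankings have "stuck"
        rounds += 1
    return ranking, rounds


# Hypothetical pages and engagement scores (e.g. CTR).
start = ["page_a", "page_b", "page_c"]
scores = {"page_a": 0.1, "page_b": 0.3, "page_c": 0.2}
final, rounds = readjust_until_stable(start, scores)
print(final, rounds)
```

With fixed scores this is just a slow sort. In practice the engagement numbers would themselves shift after every reshuffle, which is what makes it feel like one endless split test.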
Also, buckets are not "niche specific", or at most they are only "partly niche specific".
So for example, if you rank for "Lawyers in Baltimore" - you may see a change in rankings during an update.
But if you rank for "Property Dispute Baltimore Lawyer" - you may continue to rank.
-----
When you look at a "Google Volatility Tracker", keep in mind that its data is partially biased.
Example -
https://cognitiveseo.com/signals/ or
https://www.semrush.com/sensor/
This is because these trackers cost a lot per month, and the keywords being tracked are mostly money keywords tracked by agencies.
-----
And finally, to touch on your mention of "real time": there is a "certain" real-time element to Google results. Meaning that Google indexes and ranks content within minutes of it being posted in some niches (not talking about Google News).
Google has a LOT OF PROCESSING POWER. So much so that they are selling it through Google Cloud.
But think of Google as several small search engines, broken down into buckets, and not one big one. Every bucket behaves differently, based on their internal risk markers.
The bucket you hit depends on your keyword search.