I build a script with scapy to capture probe requests in a monitornig wi-fi interface. I successfully capture the requests, and some of the SSIDs contained in them. But most of the networks stored in the phone don't get broadcasted. And there isn't a clear pattern of why this happens. Some phones don't broadcast ssids at all.
I'm trying to find an explanation for the reasoning behind this behaviour, but haven't found any, apart that the hidden networks should be broadcasted in order for the phone to connect to them, but even that is not true, and most of the broadcasted ones are visible.
Another behaviour is the iPhones, that only seem to broadcast the network that they are connected to, and nothing else. (no network -> no SSIDs).
I have tried putting the interface in various channels, and results vary on the broadcasted networks, but the great majority of the saved ones in the device still aren't broadcasted.
Is there a reason behind this? Or a way to force the device to broadcast them all?
You seem to assume that the phone would do a probe request for each and every known network, permanently.
This is not the case - and not just for phone, but in general. Quoting the Wi-Fi Alliance[*]:
So this is entirely application/OS dependent if
the phone STA do an active scan, sending probe requests,
or just seat there listening for beacons (or doing nothing at all).
In my remembering - it's been a few years I didn't worked/looked at Android code, so it may have change - Android will not do an active scan, and thus will not send probe request to known SSID, unless you're in the Wi-Fi networks setting screen. It will just listen to beacons.
There are some Wi-Fi 802.11 design rationale behind this:
STA are supposed to be mobile. After all, if you're not moving from time to time, there's not much point in using Wi-Fi (except marketing or laziness, and of course smartphones changed that), you might as well get wired.
...if you're mobile, it's reasonable to think you're running on a battery,
And so you want to save battery life: so you'll rather do passive scans listening to beacons rather than active scan sending probe request, because this uses less power.
This idea of power saving alternative capabilities is spread all other the place in 802.11 design, hidden under carpet, when you're a STA.
So it is fully OS stack/application dependent from the STA if it 1/ just listen to beacons /2 actively send probe-request for every know AP 3/ send a broadcast probe-request, and also if it do so in a continuous manner, or periodically, or depending if it's in a know state (ex screen ON, and user going to the Wi-Fi networks setting screen).
Now there may be some other considerations, like some regional regulations that mandate that you first listen to beacons to decide if you can or cannot use some channels. But the main point is above.
*:
http://www.wi-fi.org/knowledge-center/faq/what-are-passive-and-active-scanning
EDIT:
On the programming side:
1/ What you seem to have is an IOP (interoperability) problem, because you expect a specific behavior from STA regarding scanning active vs passive and the involved probe-requests, and this is not how it works in the real world. Depending on your application final main goal, this may be a flawn in the design - or just a minor nuisance. You may want to restrict yourself to some specific device's brand, or try to cover all cases, which has a development cost.
2/ ...OR you were just surprised by your observations, and look for an explanation. In such case of surprising results, it goes without saying: go straight to wireshark to check your program observations (if your program is a packet sniffer) or behavior (if your program is a client/server/layer XYZ protocol implementation).
On the 802.11 strategies regarding active vs passive scan and power saving:
From "802.11 Wireless Networks: The Definitive Guide, 2nd Edition", by Matthew S. Gast ("member of the IEEE 802.11 working group, and serves as chair of 802.11 Task Group M. As chair of the Wi-Fi Alliance's Wireless Network Management marketing task group, he is leading the investigation of certification requirements for power saving, performance optimization, and location and timing services" - from his publisher bio). A book i can highly recommend.
p. 171:
p. 172:
Also, a bit old (2003), but these guys know their stuff about networking. About scanning strategies:
From Cisco "802.11 Wireless LAN Fundamentals", chapter 5 "mobility".
Page 153:
Page 154 "Determining Where to Roam":
Other interesting stuff on page 155, "Preemptive AP Discovery".