I built a script with scapy to capture probe requests on a Wi-Fi interface in monitor mode.
I successfully capture the requests, and some of the SSIDs contained in them. But most of the networks stored in the phone don't get broadcast.
And there isn't a clear pattern to why this happens. Some phones don't broadcast SSIDs at all.
I'm trying to find an explanation for this behaviour, but haven't found any, apart from the claim that hidden networks must be broadcast for the phone to connect to them. But even that doesn't hold: most of the broadcast ones are visible networks.
Another behaviour is that of iPhones, which only seem to broadcast the network they are connected to, and nothing else (no network -> no SSIDs).
I have tried putting the interface on various channels, and the broadcast networks vary, but the great majority of the ones saved on the device still aren't broadcast.
Is there a reason behind this? Or a way to force the device to broadcast them all?
You seem to assume that the phone would send a probe request for each and every known network, permanently.
This is not the case, and not just for phones but in general. Quoting the Wi-Fi Alliance[*]:
What are passive and active scanning?
The reason for client scanning is to determine a suitable AP to which
the client may [emphasis mine] need to roam now or in the future. A client can use two
scanning methods: active and passive. During an active scan, the
client radio transmits a probe request and listens for a probe
response from an AP [emphasis mine]. With a passive scan, the
client radio listens on each channel for beacons [emphasis mine again]
sent periodically by an AP. A passive scan generally takes more time,
since the client must listen and wait for a beacon versus actively
probing to find an AP. Another limitation with a passive scan is that
if the client does not wait long enough on a channel, then the client
may miss an AP beacon.
So it is entirely application/OS dependent whether
the phone STA does an active scan, sending probe requests,
or just sits there listening for beacons (or doing nothing at all).
As far as I remember (it's been a few years since I last worked on or looked at Android code, so it may have changed), Android will not do an active scan, and thus will not send probe requests for known SSIDs, unless you're in the Wi-Fi networks settings screen. It will just listen for beacons.
There is some 802.11 design rationale behind this:
STA are supposed to be mobile. After all, if you're not moving from
time to time, there's not much point in using Wi-Fi (except marketing
or laziness, and of course smartphones changed that), you might as
well get wired.
...if you're mobile, it's reasonable to think you're running on a
battery,
And so you want to save battery life: you'd rather do passive
scans, listening for beacons, than active scans, sending probe
requests, because this uses less power.
This idea of alternative power-saving capabilities is spread all over the place in the 802.11 design, hidden under the carpet, when you're a STA.
So it is fully dependent on the STA's OS stack/application whether it 1/ just listens for beacons, 2/ actively sends a probe request for every known AP, or 3/ sends a broadcast probe request; and also whether it does so continuously, periodically, or depending on its state (e.g. screen on, and user going to the Wi-Fi networks settings screen).
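To see which of these strategies a given device follows, it helps to separate broadcast (wildcard SSID) probe requests from directed ones in your capture. Below is a minimal sketch in plain Python, not taken from your script. It assumes the radiotap header has already been stripped and that the management header is the plain 24-byte one (no HT Control field). The function names (probe_request_ssid, classify, build_probe_request) and the MAC/SSID values are made up for illustration.

```python
PROBE_REQUEST_FC0 = 0x40   # first Frame Control byte: version 0, type 0 (management), subtype 4
MGMT_HEADER_LEN = 24       # assumption: no HT Control field in the header
SSID_ELEMENT_ID = 0


def probe_request_ssid(frame: bytes):
    """Return the SSID bytes of a probe request (b"" for a wildcard probe).

    Returns None if the frame is not a probe request or is truncated.
    """
    if len(frame) < MGMT_HEADER_LEN or frame[0] != PROBE_REQUEST_FC0:
        return None
    pos = MGMT_HEADER_LEN
    # The probe request body is a sequence of (id, length, value) elements.
    while pos + 2 <= len(frame):
        elem_id, elem_len = frame[pos], frame[pos + 1]
        value = frame[pos + 2 : pos + 2 + elem_len]
        if len(value) < elem_len:
            return None  # truncated element
        if elem_id == SSID_ELEMENT_ID:
            return value
        pos += 2 + elem_len
    return None


def classify(frame: bytes) -> str:
    """Label a frame as a broadcast probe, a directed probe, or neither."""
    ssid = probe_request_ssid(frame)
    if ssid is None:
        return "not-a-probe-request"
    return "broadcast" if ssid == b"" else "directed"


def build_probe_request(src_mac: bytes, ssid: bytes) -> bytes:
    """Build a minimal probe request for testing (hypothetical helper)."""
    header = bytes([PROBE_REQUEST_FC0, 0x00])  # frame control
    header += b"\x00\x00"                      # duration
    header += b"\xff" * 6                      # addr1: broadcast destination
    header += src_mac                          # addr2: source
    header += b"\xff" * 6                      # addr3: wildcard BSSID
    header += b"\x00\x00"                      # sequence control
    return header + bytes([SSID_ELEMENT_ID, len(ssid)]) + ssid


mac = bytes.fromhex("020000000001")                      # made-up address
print(classify(build_probe_request(mac, b"")))           # broadcast
print(classify(build_probe_request(mac, b"HomeNet")))    # directed
```

A device that only ever shows up as "broadcast" is doing option 3/ above, and one that never shows up at all may simply be scanning passively.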
Now there may be some other considerations, like some regional regulations that mandate that you first listen to beacons to decide if you can or cannot use some channels. But the main point is above.
*:
http://www.wi-fi.org/knowledge-center/faq/what-are-passive-and-active-scanning
EDIT:
On the programming side:
1/ What you seem to have is an IOP (interoperability) problem, because you expect a specific behavior from STAs regarding active vs passive scanning and the probe requests involved, and this is not how it works in the real world. Depending on your application's ultimate goal, this may be a flaw in the design, or just a minor nuisance. You may want to restrict yourself to specific device brands, or try to cover all cases, which has a development cost.
2/ ...OR you were just surprised by your observations, and are looking for an explanation. In such a case of surprising results, it goes without saying: go straight to Wireshark to check your program's observations (if your program is a packet sniffer) or behavior (if your program is a client/server/layer XYZ protocol implementation).
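For that cross-check, a per-device summary of what your sniffer saw is easier to compare against Wireshark (for example with the display filter wlan.fc.type_subtype == 4) than a raw packet log. A rough sketch, assuming your script can hand over (source MAC, SSID) pairs with an empty SSID meaning a wildcard probe; the function name and the sample values are hypothetical:

```python
from collections import defaultdict


def summarize_probes(observations):
    """Group probe requests by source MAC.

    observations: iterable of (source_mac, ssid) string pairs,
    where ssid == "" stands for a wildcard (broadcast) probe.
    """
    summary = defaultdict(lambda: {"wildcard_probes": 0, "ssids": set()})
    for mac, ssid in observations:
        entry = summary[mac.lower()]  # normalise case so one device is one key
        if ssid == "":
            entry["wildcard_probes"] += 1
        else:
            entry["ssids"].add(ssid)
    return dict(summary)


# Made-up sample data, standing in for what the sniffer collected.
seen = [
    ("02:00:00:00:00:01", ""),
    ("02:00:00:00:00:01", "HomeNet"),
    ("02:00:00:00:00:01", "HomeNet"),
    ("02:00:00:00:00:02", ""),
]
for mac, info in sorted(summarize_probes(seen).items()):
    print(mac, info["wildcard_probes"], sorted(info["ssids"]))
```

If the counts and SSIDs per device match what Wireshark shows for the same capture, the missing networks are the phone's choice and not a bug in the script.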
On the 802.11 strategies regarding active vs passive scan and power saving:
From "802.11 Wireless Networks: The Definitive Guide, 2nd Edition", by Matthew S. Gast ("member of the IEEE 802.11 working group, and serves as chair of 802.11 Task Group M. As chair of the Wi-Fi Alliance's Wireless Network Management marketing task group, he is leading the investigation of certification requirements for power saving, performance optimization, and location and timing services" - from his publisher bio). A book I can highly recommend.
p. 171:
ScanType (active or passive)
Active scanning uses the transmission of Probe Request frames to
identify networks in the area. Passive scanning saves battery power by
listening for Beacon frames.
p. 172:
Passive Scanning
Passive scanning saves battery power because it does not require
transmitting. In passive scanning, a station moves to each channel on
the channel list and waits for Beacon frames.
Also, a bit old (2003), but these guys know their stuff about networking. About scanning strategies:
From Cisco "802.11 Wireless LAN Fundamentals", chapter 5 "mobility".
Page 153:
Roaming Algorithms
The mechanism to determine when to roam is not defined by the IEEE
802.11 specification and is, therefore, left to vendors to implement. [...] The fact that the algorithms are left to vendor implementation
provide vendors an opportunity to differentiate themselves by creating
new and better performing algorithms than their competitors. Roaming
algorithms become a vendor’s “secret sauce,” and as a result are kept
confidential.
Page 154 "Determining Where to Roam":
There is no ideal technique for scanning. Passive scanning has the
benefit of not requiring the client to transmit probe requests but
runs the risk of potentially missing an AP because it might not
receive a beacon during the scanning duration. Active scanning has the
benefit of actively seeking out APs to associate to but requires the
client to actively transmit probes. Depending on the implementation
for the 802.11 client, one might be better suited than the other. For
example, many embedded systems use passive scanning as the preferred
method [emphasis mine] [...]
Other interesting stuff on page 155, "Preemptive AP Discovery".