I have an Android smartphone connecting via WIFI to an embedded AP. I am sniffing WIFI traffic with a laptop running Tshark on Linux. I am transferring small (234-byte) TCP packets in bursts of five at 100 ms intervals, followed by 500 ms with no data. Periodically, packets are ignored, forcing retransmission. Some level of packet retransmission is expected when transferring data over TCP sockets, but this is excessive, especially because the packets are received by the sniffer without problem (i.e., there is no ‘in flight’ corruption), AND subsequent retransmitted packets are also ignored (dropped) by Android.
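For anyone who wants to reproduce the traffic pattern, here is a minimal sketch of a sender. It assumes each burst is five sends 100 ms apart, with the 500 ms data-free window completing a 1 second cycle; that is one reading of the timing described above. The names `send_offsets` and `run_sender` are hypothetical, and the host and port are whatever the receiving side listens on.

```python
import socket
import time

PACKET_SIZE = 234       # bytes per packet, from the description above
BURST_COUNT = 5         # packets per burst
BURST_INTERVAL = 0.100  # seconds between packets within a burst
IDLE_GAP = 0.500        # seconds with no data after each burst (assumed)


def send_offsets(cycles):
    """Return planned send times (seconds from start) for `cycles` bursts."""
    cycle_length = BURST_COUNT * BURST_INTERVAL + IDLE_GAP
    return [
        round(c * cycle_length + i * BURST_INTERVAL, 3)
        for c in range(cycles)
        for i in range(BURST_COUNT)
    ]


def run_sender(host, port, cycles):
    """Send the bursty pattern to host:port over a single TCP connection."""
    payload = bytes(PACKET_SIZE)  # 234 zero bytes as a stand-in payload
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle so small writes are not coalesced before sending.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.monotonic()
        for offset in send_offsets(cycles):
            delay = start + offset - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            sock.sendall(payload)
```

Scheduling against a monotonic start time, rather than sleeping a fixed amount after each send, keeps the bursts from drifting when a send blocks on retransmission.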
Please note: When I say ‘IGNORED’, I mean I don’t see any acknowledgement, either from the receiving radio (802.11 ACK) or from the receiving TCP stack (TCP ACK), in the Wireshark trace. In the image below, data packet 599 is acknowledged immediately, while the next data packet (606) is ignored, as are the next four retransmissions (607-610).
This DOES NOT HAPPEN with the same APK running on Android v4.x (4.0.4 & 4.2.2 tested), nor on a similar application running on iOS. In both cases, there IS the expected dropped TCP packet, followed by a retransmission, but that is almost always ACK’ed immediately.
Since there is nothing special about the packets I am transferring, and seamless retransmission of TCP packets is expected to some extent to address packet loss, I suspect that many applications are suffering this same issue, but simply accept the reduction in overall transfer rate, attributing the slower transfer to “WIFI congestion”.
In my case, the WIFI hardware/firmware solution (provided by a third party) chokes when an excessive TCP backlog occurs. They have a new solution with a band-aid fix that appears to help with recovery, but that does not resolve the problem for hardware already in the field.
I believe something has changed under Android 5.x such that the radio is periodically either shut down or shifted off-channel. Since it appears to happen periodically, I thought this might be due to channel scanning (with the expectation that any packets dropped during the scan interval would be recovered by retransmissions when the radio returned). I was unable to find a clear definition of Android channel scanning behavior (I have a separate post on that).
When the failure occurs, retransmissions arrive in bursts of 4 or 5 packets several milliseconds apart, then nothing for 1 second, followed by another burst. Usually, retransmitted packets are ignored for some time (possibly multiple seconds). Eventually, a retransmitted packet is acknowledged, and transmission continues with the next packet.
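To quantify this burst pattern from a capture, one option is to export the retransmission timestamps and group them by gap. The sketch below assumes the timestamps were exported with something like `tshark -r capture.pcap -Y tcp.analysis.retransmission -T fields -e frame.time_relative`. The file name, the `group_bursts` helper, the 100 ms gap threshold, and the sample values are all hypothetical, not taken from my trace.

```python
def group_bursts(timestamps, max_gap=0.1):
    """Split timestamps (seconds) into bursts separated by more than max_gap."""
    bursts = []
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)   # close to the previous packet: same burst
        else:
            bursts.append([t])     # large gap: start a new burst
    return bursts


# Hypothetical sample shaped like the pattern described above:
# a few packets milliseconds apart, then roughly 1 second of silence.
sample = [10.000, 10.004, 10.009, 10.013,
          11.020, 11.025, 11.029, 11.034, 11.040]

for burst in group_bursts(sample):
    print(len(burst), "packets starting at", burst[0])
```

Comparing the burst start times against the times of DNS probes or scan activity in the same capture might show whether the stalls line up with anything periodic.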
I noted that there was also periodic DNS probing occurring (the list of hostnames periodically probed via DNS queries is below). Some may come from installed applications, while others will come from the operating system (for instance, ‘connectivitycheck.android.com’ is likely used to verify whether the AP actually provides internet connectivity, which it does NOT in my case).
android.clients.google.com
android.googleapis.com
clients3.google.com
cmdts.ksmobile.com
connectivitycheck.android.com
graph.facebook.com
mtalk.google.com
px.demdex.net
weather.ksmobile.com
The px.demdex.net entry drew my attention, as I had no idea who this represented. It turned out to be Adobe. Since there were clearly bits of code actively probing that I had no knowledge of, and to ensure my issue is purely an Android issue, I factory reset the Nexus 4 smartphone, and declined any initialization options that might be implicated in WIFI comms issues (such as Google Location services, allow to use WIFI even when off, etc.). TCP packets are still dropped.
On the Android side, I have:
- Set Location to “Device Only” (GPS)
- Disabled WIFI “Scanning always available”
- Acquired a “WIFI_MODE_FULL_HIGH_PERF” lock within the app whenever the socket is open.
- Socket is opened when app starts communication, and held open until app closes. If no packets appear for some time (currently 3 seconds), app closes socket & opens a new one in an attempt to force comms to re-establish.
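The 3 second reconnect rule in the last item can be sketched as a small idle watchdog. The app itself is an Android APK. The logic is shown here in Python only to make the timing rule explicit. The `IdleWatchdog` class and its method names are hypothetical, not code from the app.

```python
import time

IDLE_LIMIT = 3.0  # seconds without data before the socket is recycled


class IdleWatchdog:
    """Tracks when the last packet arrived and flags a stalled connection."""

    def __init__(self, limit=IDLE_LIMIT, clock=time.monotonic):
        self.limit = limit
        self.clock = clock          # injectable so the logic can be tested
        self.last_rx = clock()

    def packet_received(self):
        """Call whenever data arrives on the socket."""
        self.last_rx = self.clock()

    def should_reconnect(self):
        """True once no data has arrived for at least `limit` seconds."""
        return self.clock() - self.last_rx >= self.limit
```

The receive loop would call `packet_received()` on every read and, when `should_reconnect()` returns true, close the socket and open a new one.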
I discovered the “setAllowScansWithTraffic()” function of IWifiManager, which sounds like it would give me explicit control of scanning (to remove that possibility from the list), but it seems implementing this would be difficult, and could not be part of my app (?). I believe IWifiManager provides a stub for Android implementers to build their own WIFI manager (OS) service, and it is not intended to be used at the app level.
I’d appreciate any input / suggestions.
UPDATE
While chasing the "channel scanning" possibility, I logged ACK packets on WIFI channel 3 with my packet sniffer, while the hardware under test was operating on WIFI channel 8. This may explain the missing ACKs - they are sent on the wrong WIFI channel. I've opened an Android bug
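To put the two channel numbers in context, here is a small sketch that converts 2.4 GHz channel numbers to center frequencies, using the standard channel plan (2407 + 5 × channel MHz for channels 1 to 13, 2484 MHz for channel 14). The helper names are hypothetical.

```python
def center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz WIFI channel."""
    if 1 <= channel <= 13:
        return 2407 + 5 * channel
    if channel == 14:
        return 2484
    raise ValueError("not a 2.4 GHz channel: %r" % channel)


def separation_mhz(a, b):
    """Distance in MHz between the centers of two 2.4 GHz channels."""
    return abs(center_mhz(a) - center_mhz(b))


print(center_mhz(3), center_mhz(8), separation_mhz(3, 8))
```

Channels 3 and 8 are 25 MHz apart, the same spacing as the conventionally non-overlapping channels 1, 6 and 11. Under that convention, a sniffer tuned to channel 3 would not normally be expected to decode frames sent on channel 8.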