When developing software, programmers often rely on ready-made solutions for certain tasks. After all, using an existing solution instead of building one from scratch is faster and cheaper, and it just makes sense, right? Well, not always. Today I’ll tell you about our hunt for a hidden bug in Geocoder (iOS 4.3 SDK) that surfaced while we were working with geolocation services. Let this be an example of how it is sometimes best to use your own custom-built solution.
At Azoft, we were working on a large-scale iOS application development project. The app required frequent communication with the server, which is why we decided to use Google Protocol Buffers (Protobuf for short) on the server side. Among other benefits, Protobuf made it easier to support older iOS versions.
Another feature I must mention is that the app allowed users to call a toll-free support phone number. However, there was one essential point: users located within the country should dial 8-800-..., while users calling from outside the country should dial +7-800-...
To make it easier for the user, we decided to use the MKReverseGeocoder class of the iOS MapKit framework. It converts the coordinates of the user's current location into place information, including the country, so the application can choose the appropriate support phone number.
Everything was going as planned. The development phase was completed and the app was successfully tested multiple times. No bugs were recorded, and it seemed as though the app was ready for release.
Then one day the client reported that the app had crashed during testing. The crash was so critical that the client could not even continue testing. It seemed to occur at random moments, after which the app would not launch for quite some time; then, after a while, it would start working again. The client observed this behavior on both iPhone 3GS and iPhone 4.
However hard we tried to reproduce the situation ourselves, the app did not crash. To figure out what exactly was going on, we asked the client to provide crash logs. Here’s what the stack trace looked like:
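The number-selection rule described above can be sketched in a few lines. This is only an illustration, written in Python rather than the Objective-C the app itself would have used. The name `support_prefix`, the placeholder country code `"XX"` (the country is not named here) and the fallback to the international form when the country is unknown are all assumptions made for the example.

```python
# Sketch only: pick the support-number prefix from a reverse-geocoded country code.
# HOME_COUNTRY is a hypothetical placeholder; the real country code is not given.
HOME_COUNTRY = "XX"

DOMESTIC_PREFIX = "8-800"        # dialled from inside the country
INTERNATIONAL_PREFIX = "+7-800"  # dialled from outside the country


def support_prefix(country_code, home_country=HOME_COUNTRY):
    """Return the dialling prefix for the given ISO country code.

    If the country is unknown (None or empty string), fall back to the
    international form. That fallback is an assumption of this sketch.
    """
    if country_code and country_code.upper() == home_country.upper():
        return DOMESTIC_PREFIX
    return INTERNATIONAL_PREFIX
```

The important design point is the fallback: the geocoder result is treated as optional, so a failed lookup still leaves the user with a number to dial.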
0 libobjc.A.dylib 0x3597ec98 0x3597c000 + 11416
1 ProtocolBuffer 0x32a26fb8 0x32a24000 + 12216
2 ProtocolBuffer 0x32a26cea 0x32a24000 + 11498
3 ProtocolBuffer 0x32a280f0 0x32a24000 + 16624
4 Foundation 0x31162230 0x31151000 + 70192
5 Foundation 0x31162138 0x31151000 + 69944
6 CFNetwork 0x30ddb576 0x30dcd000 + 58742
7 CFNetwork 0x30dd0fb2 0x30dcd000 + 16306
8 CFNetwork 0x30dd10ca 0x30dcd000 + 16586
9 CFNetwork 0x30dd0e34 0x30dcd000 + 15924
10 CFNetwork 0x30dd0de6 0x30dcd000 + 15846
11 CFNetwork 0x30dd0d58 0x30dcd000 + 15704
12 CFNetwork 0x30dd0cd6 0x30dcd000 + 15574
13 CoreFoundation 0x34982a72 0x3490d000 + 481906
14 CoreFoundation 0x34984758 0x3490d000 + 489304
15 CoreFoundation 0x349854e4 0x3490d000 + 492772
16 CoreFoundation 0x34915ebc 0x3490d000 + 36540
17 CoreFoundation 0x34915dc4 0x3490d000 + 36292
18 GraphicsServices 0x31db5418 0x31db1000 + 17432
19 GraphicsServices 0x31db54c4 0x31db1000 + 17604
20 UIKit 0x350ccd62 0x3509e000 + 191842
21 UIKit 0x350ca800 0x3509e000 + 182272
22 MyApp 0x0000b7b8 0x1000 + 42936
23 MyApp 0x000021d4 0x1000 + 4564
According to the crash log, something in the system was calling the Protobuf library without going through our application, which seemed impossible. We reviewed the entire codebase and confirmed that there was no way to reach Protobuf other than through our own classes. But what baffled us the most was that we couldn't reproduce the crash ourselves.
After many unsuccessful attempts to find the error, we asked the client to run the app during a video conference call and show us the log or the device console. But during the one-and-a-half-hour Skype call the client was not able to reproduce the mysterious crash. We were completely puzzled.
For another month, our team of developers and testers continued to hunt for this mysterious bug, trying unsuccessfully to make the app crash. Once again, we decided to call the client in hopes of getting the app to crash while on a conference call with us. This time we were lucky, and the app finally crashed! In any other situation we would have been pretty disappointed that an app almost ready for release crashed so suddenly and wouldn’t launch again. In this case, however, we were excited to witness this mysterious event and finally get a chance to figure out what was wrong.
And here's what we noticed: an alert from Geocoder appeared in the console. As it turned out later, Google Maps had banned the user's IP address for several hours in response to too many requests to the service. But the real reason the app crashed on iOS 4.3.x was that the error message somehow never reached Geocoder, so Geocoder could not process it properly. We also figured out which part of the system was using Protobuf without going through our application: it was the MapKit framework, which communicated with Google Maps via Protobuf.
So, why was it so difficult to find this error? Because Google Maps has a limit of 2500 requests coming from a single IP address. If you have a “white” (public, dedicated) IP address, or the IP is used only within a single office, the chances of exceeding this limit are very small and it is virtually impossible to reproduce the crash, which was the case in our situation. However, when connecting to Wi-Fi in the middle of a large city, thousands of people might share the same external IP address, and the chances of the app crashing increase significantly.
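To make the observation about the stack trace concrete, here is a small sketch that reads the frames listed above and collapses them into a chain of binary images, from the outermost caller down to the crash site. It is an illustration in Python, not a tool we used at the time. The helper names `image_chain` and `direct_caller` are made up for the example.

```python
import re

# The frames from the crash log above: index, binary image, address, base + offset.
CRASH_LOG = """\
0 libobjc.A.dylib 0x3597ec98 0x3597c000 + 11416
1 ProtocolBuffer 0x32a26fb8 0x32a24000 + 12216
2 ProtocolBuffer 0x32a26cea 0x32a24000 + 11498
3 ProtocolBuffer 0x32a280f0 0x32a24000 + 16624
4 Foundation 0x31162230 0x31151000 + 70192
5 Foundation 0x31162138 0x31151000 + 69944
6 CFNetwork 0x30ddb576 0x30dcd000 + 58742
7 CFNetwork 0x30dd0fb2 0x30dcd000 + 16306
8 CFNetwork 0x30dd10ca 0x30dcd000 + 16586
9 CFNetwork 0x30dd0e34 0x30dcd000 + 15924
10 CFNetwork 0x30dd0de6 0x30dcd000 + 15846
11 CFNetwork 0x30dd0d58 0x30dcd000 + 15704
12 CFNetwork 0x30dd0cd6 0x30dcd000 + 15574
13 CoreFoundation 0x34982a72 0x3490d000 + 481906
14 CoreFoundation 0x34984758 0x3490d000 + 489304
15 CoreFoundation 0x349854e4 0x3490d000 + 492772
16 CoreFoundation 0x34915ebc 0x3490d000 + 36540
17 CoreFoundation 0x34915dc4 0x3490d000 + 36292
18 GraphicsServices 0x31db5418 0x31db1000 + 17432
19 GraphicsServices 0x31db54c4 0x31db1000 + 17604
20 UIKit 0x350ccd62 0x3509e000 + 191842
21 UIKit 0x350ca800 0x3509e000 + 182272
22 MyApp 0x0000b7b8 0x1000 + 42936
23 MyApp 0x000021d4 0x1000 + 4564
"""

FRAME_RE = re.compile(r"^(\d+)\s+(\S+)\s+0x[0-9a-f]+\s+0x[0-9a-f]+\s+\+\s+\d+$")


def image_chain(log):
    """Return binary images from outermost caller to crash site,
    with consecutive duplicates collapsed."""
    images = []
    for line in log.strip().splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            images.append(match.group(2))
    images.reverse()  # frame 0 is the crash site, so reverse to get callers first
    chain = []
    for image in images:
        if not chain or chain[-1] != image:
            chain.append(image)
    return chain


def direct_caller(log, image):
    """Return the image that appears immediately below `image` in the stack."""
    chain = image_chain(log)
    index = chain.index(image)
    return chain[index - 1] if index > 0 else None
```

Run against this log, `direct_caller(CRASH_LOG, "ProtocolBuffer")` returns `"Foundation"`. The MyApp frames sit only at the very bottom of the stack, and system frameworks fill every level between them and the Protobuf frames.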
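A bit of arithmetic shows why the shared IP address matters so much. The sketch below uses the 2500-request limit mentioned above; the user counts are hypothetical figures chosen only for illustration, and the function name `requests_per_user` is made up.

```python
LIMIT_PER_IP = 2500  # request limit per IP address, as mentioned in the text


def requests_per_user(users_sharing_ip, limit=LIMIT_PER_IP):
    """Average number of requests each user can make before the
    shared IP address reaches the limit."""
    if users_sharing_ip < 1:
        raise ValueError("at least one user must be using the address")
    return limit / users_sharing_ip


office = requests_per_user(10)       # hypothetical small office: 250.0 each
hotspot = requests_per_user(2000)    # hypothetical city hotspot: 1.25 each
```

With ten people behind one address, each of them can make hundreds of requests before anything goes wrong. With a couple of thousand behind it, the limit is reached after roughly one request per person.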
Despite the long search for this mysterious bug, the project was successfully completed and delivered within a reasonable timeframe. To get rid of the bug we decided to stop using MKReverseGeocoder and instead write our own custom request to Google Maps.
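The key property of a custom request is that an error response, such as a rate-limit rejection, is handled as a normal outcome instead of a crash. The sketch below illustrates the idea in Python. It assumes a JSON response shaped like the one the Google Geocoding web service returns (a `status` field, and `address_components` entries tagged with the `country` type). That format and the function name `country_from_geocode_response` are assumptions for the example, not a description of our production code.

```python
import json


def country_from_geocode_response(body):
    """Return the country code from a geocoding JSON body, or None.

    Any failure (malformed body, non-OK status such as OVER_QUERY_LIMIT,
    missing country component) yields None instead of raising.
    """
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        return None  # not JSON at all, e.g. an HTML error page
    if not isinstance(data, dict) or data.get("status") != "OK":
        return None  # the service refused or could not answer the request
    for result in data.get("results", []):
        for component in result.get("address_components", []):
            if "country" in component.get("types", []):
                return component.get("short_name")
    return None


# Hand-written sample bodies for illustration, not captured traffic.
OK_BODY = json.dumps({
    "status": "OK",
    "results": [{"address_components": [
        {"long_name": "Germany", "short_name": "DE",
         "types": ["country", "political"]},
    ]}],
})
LIMIT_BODY = json.dumps({"status": "OVER_QUERY_LIMIT", "results": []})
```

Here `country_from_geocode_response(OK_BODY)` returns `"DE"`, while the rate-limited body and a non-JSON body both return `None`, so the caller can fall back to a default phone number and keep running.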
Apple has still not acknowledged this error in Geocoder (SDK for iOS 4.3.x), though perhaps they will fix it soon. At the time this post was written, the bug was still there. Hopefully, sharing our experience will help those dealing with similar challenges.