The Ops Community

ujjavala

GOing down the rabbit hole

I recently upgraded a service written in golang that was deployed using Google's AppEngine, and I have just one word for the experience: unpleasant. Just to be clear, I really am all in for golang and was also impressed with how easy it is to deploy any service using gcloud. Unfortunately, the ride from go1.11 to go1.12+ was more of a roller coaster for me. Let’s take a look at this three-course meal together so that we are well prepared for the next upgrade.

For appetizers

I had already run my code locally using dev_appserver.py --enable_host_checking=no --support_datastore_emulator=yes app.yaml, verified the entries in the datastore, and was fairly satisfied with the code that I had written.

I was all set to deploy my service on go111 until I saw an error on my console related to a private repository reference. To resolve this issue, I leveraged go vendor (introduced go 1.15 onward), which copies all third-party dependencies to a vendor folder in the project root. I quickly updated my app.yaml file and specified the version there as go115. Fortunately, the reference error was resolved and I could deploy the service.

Here comes the main course

Deployment with go115 succeeded, and the health endpoint worked too. I was happy and started celebrating by updating the README.md file with emojis and refactoring the code here and there. But my happiness was short-lived: I found out that the other endpoints didn’t work.

While accessing the other endpoints, I got the error "metadata fetch failed: metadata server returned HTTP 404". I googled for a while but didn’t find the exact cause. I tried to fix the issue with a few of these stackoverflow suggestions along with a few others, but didn't have any luck there. It took me almost 2 days to figure out that I had to upgrade the AppEngine version. I bumped up the version and tada... I could access the other endpoints too.

Finally, the dessert

I could see the css and labels for the page loading, but I couldn’t see any data there. Data, without which the page was just a skeleton, without any essence.

I went back to the logs and found yet another error popping up: "internal.flushLog: Flush RPC: Call error 7: App Engine APIs are not enabled, please add app_engine_apis: true to your app.yaml". The error made complete sense to me and I did just what it suggested. I added the flag to the app.yaml file and quickly deployed the app.

And tada... the endpoint still did not have any data.

I was again lost in the midst of suggestions and comments, and after navigating through all the pages on Google (10, to be precise), I found nothing.

I had a hunch that it might again be related to some other upgrade, and since it had something to do with data, I upgraded datastore. I imported datastore from cloud.google.com/go/datastore instead of google.golang.org/appengine and made the code compatible, since the APIs were a bit different (I found this and this really helpful). I deployed the code to the dev environment and finally… the fix worked beautifully and I could see the data there.

Mint anyone?

A few findings off the top of my head

  1. Though the code was deployed successfully, I could not test the appengine-datastore setup on my local machine. For the standard environment, there is no documentation available for local setup for go 1.12+ versions. Before the upgrade, I had referred to gcloud’s official document, but this didn’t work for go1.12+ versions.
  2. Testing just the AppEngine locally was difficult for go1.12+ versions. I observed that even the document recommends doing just a go run.
  3. Just running the datastore emulator is possible using gcloud beta emulators datastore start. But again, this is not very helpful if you need AppEngine to run too.
  4. There were many incompatibility issues between AppEngine and datastore, even if you haven’t upgraded yet. In order to test appengine, the recommended solution is to use GOOGLE_CLOUD_PROJECT= <projectId> <project_folder_path>. But this is incompatible if you are using datastore. For datastore, you would need the deprecated dev_appserver.py way to test things out.

I felt that it would have been a lot smoother if the steps for local development were more clearly articulated and the error messages in gcloud were more intuitive (they were really misleading). A few things worked for me and a few didn’t. But that was just my experience, I guess, which is very subjective by the way. Not everyone might be running into these issues on a daily basis.

What we can hope for is that, if any one of us does stumble upon these issues, we know what our modus operandi is going to be and exactly how we are going to get our peace of mind back.

Keep calm and happy coding!
