I had to build a web scraper to buy groceries

Grocery shopping was one of my least favorite chores even before the pandemic. Unlike many people I know, I always preferred online shopping over going to the supermarket. Now it’s not a matter of personal preference anymore; everybody stays at home and especially avoids crowded indoor places like supermarkets. It’s good to know that people are acting responsibly, but I never thought this could mean that I wouldn’t be able to do online grocery shopping anymore!

Here’s the deal: very few supermarket chains in Turkey have online stores. Migros Sanalmarket is one of them, and it’s arguably the best one. But they don’t have unlimited resources, obviously. When everybody decided to switch to online shopping all of a sudden, they couldn’t handle the demand spike. Even though their delivery system works from 8:30 AM to 10:00 PM every day, it’s virtually impossible to find an empty slot, at least if you play nice.

I’ve frequently visited the online store for the past two weeks and attempted to place an order, but I always got this message (translated from Turkish):

“We do not have delivery to the neighborhood you have chosen for the next 4 days.”

Probably as soon as new slots are opened, they get occupied in minutes by a few dozen lucky people who happen to be online at that time.

I didn’t want to resort to violence but after yet another failed attempt yesterday, I got really frustrated and finally unsheathed Chrome developer tools. I looked at the Network tab to see which HTTP request was returning the above response. I found out that there was an endpoint for checking live delivery availability, and it was getting hit automatically when a logged-in user visited the website.

The first step of my plan was to write a script that makes a GET request to that endpoint and checks if delivery is available for my neighborhood within the next 4 days. Firstly, I needed to find out what additional information I had to send with my request to make the server treat me as a logged-in user. I looked at the cookies in the Request Cookies section:

The SESSION cookie looks promising, doesn’t it? I did what anybody does when they need to quickly put together a script to get something done; I created a main.py file. The requests package is more than capable of setting cookies and making simple HTTP requests. This is how I tried to get the same response from the server programmatically:

main.py
import requests

# copy & paste the SESSION cookie value obtained from request headers
session = requests.Session()
jar = requests.cookies.RequestsCookieJar()
jar.set('SESSION', 'Y2IxOWE0YWQtNmM0Ni00ZWYzLTkzYmItOGI4YWQ0MDI1MTg4')
session.cookies = jar

# make a GET request to the delivery availability endpoint
r = session.get('https://www.migros.com.tr/teslimat/en-yakin-siparis-dilimi')
print(r.text)

I got this response:

<span class="part-area date-header mobile-hide pull-right">
<b>
Önümüzdeki 4 gün için seçtiğiniz mahalleye teslimatımız bulunmamaktadır.
</b>
</span>

This is how you say “nope” in Turkish. It was exactly what I wanted to receive. If the session cookie hadn’t worked, or if it hadn’t been enough on its own, I would have received the HTML for a login page. I know because that was what I received when I sent the request without any cookies.
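The post doesn't show how the script decides that an attempt succeeded, so the following is only a minimal sketch under one assumption: a response counts as "no slot" when it contains the exact sentence shown above. The names `UNAVAILABLE_MESSAGE` and `no_slot_available` are hypothetical and not taken from the original script.

```python
# The sentence the server returned when no delivery slot was open
# (copied from the response shown above).
UNAVAILABLE_MESSAGE = (
    "Önümüzdeki 4 gün için seçtiğiniz mahalleye teslimatımız bulunmamaktadır."
)


def no_slot_available(html: str) -> bool:
    """Return True when the response carries the known "no delivery" message.

    Assumption: any response without this sentence needs a closer look.
    It could be an open slot, but it could also be the login page that
    comes back once the SESSION cookie expires.
    """
    return UNAVAILABLE_MESSAGE in html
```

With the earlier snippet, this would be called as `no_slot_available(r.text)`. Because an expired session returns a login page rather than this message, a real script would probably want to rule that case out before treating the result as good news.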

So far so good. I could run a script every minute as a cron job, pipe stdout to a log file and check the logs every once in a while. So I set up the cron job and went AFK for 30 minutes. When I got back, I saw 30 failed attempts.

Obviously, I didn’t want to sit there and check the logs forever. That’s why I decided to use one of my existing Mailgun domains for sending myself an email if and when one of the attempts succeeded. It’s actually pretty easy to send an email using the Mailgun API:

main.py
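import os  # needed for os.getenv; requests is already imported at the top of main.py

# Send the notification email through the Mailgun HTTP API.
# MAILGUN_DOMAIN is passed as the request URL, so it must hold the full
# URL of the Mailgun messages endpoint for the sending domain.
requests.post(
    os.getenv('MAILGUN_DOMAIN'),
    auth=("api", os.getenv('MAILGUN_SECRET')),
    data={
        "from": os.getenv('EMAIL_FROM'),
        "to": [os.getenv('EMAIL_TO')],
        "subject": "Sanalmarket Is Available!!!",
        "text": f"Go to {os.getenv('SHOPPING_CART_URL')}"
    }
)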

The only issue was that if my script ever found an available slot, it would start sending an email to me every minute until there were no slots left. I really didn’t want to spam myself. As a solution, I made it so that the script would create a .lock file the first time it received a success response and sent me an email. Also, at the beginning of each execution, it would check that .lock file and terminate immediately if it existed.
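The lock-file guard described above can be sketched roughly like this. It is an illustration rather than the actual script: the function names are hypothetical, the `.lock` file is assumed to live in the working directory, and the slot check and email sender are passed in as plain callables so the guard stays separate from the HTTP code.

```python
from pathlib import Path
from typing import Callable

# Assumed location: the post only says the file is called ".lock".
LOCK_FILE = Path(".lock")


def run_once(
    slot_available: Callable[[], bool],
    send_email: Callable[[], None],
    lock_file: Path = LOCK_FILE,
) -> str:
    """Run one scheduled check and notify at most once overall."""
    # A previous run already sent the email: terminate immediately.
    if lock_file.exists():
        return "skipped"

    # No open slot yet: do nothing and let the next cron run try again.
    if not slot_available():
        return "unavailable"

    # First success: send the email, then create the lock file so that
    # later runs exit early instead of sending another message.
    send_email()
    lock_file.touch()
    return "notified"
```

Deleting the `.lock` file by hand re-arms the notifier for the next time it is needed.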

After adding the email functionality, I went AFK once more and attended to other stuff. About an hour later, I got an email notification on my phone. The sender was me, and I was talking some nonsense that had something to do with grocery shopping. I checked it out anyway. It turned out that Sanalmarket was available for delivery to my neighborhood if I placed an order immediately! I had already prepared my shopping cart, so all I needed to do was confirm the order.

I received my delivery the next morning. I’m pretty sure I wouldn’t have gotten a chance to place an order anytime soon if it weren’t for that little hack.

That’s all there is to my mini-adventure for this weekend. My script is available on GitHub for those who are interested. Feel free to reach out to me if you have any comments or questions. Also, make sure to subscribe if you’re interested in getting email updates on my future articles.