SafeGraph provided three questions as part of the hiring process for a Technical Product Manager. Below are my answers for the three questions.
You can view a PDF of the questions here.
Load and initialize notebook extensions, install R packages as needed, and load dependencies and setup globals.
✔ Extensions loaded
✔ dotenv initialized
✔ R packages installed
✔ python dependencies loaded
You are going to launch a new API meant for data science users and you want to have at least one client library ready at launch. Do you build a client in R, Python or both? How do you decide?
That depends on the expected behavior and knowledge of the user base.
Initially I had these questions:
Eventually I decided to implement a basic call myself. I first tried the query from the cURL version of the directions, but something about the JSON encoding wasn't working. The query in the cURL example looked like a serialized version of the GraphQL query, so I wrote the GraphQL query as a multiline string and used Python's standard json library to serialize it (see FIG 1A). Success! I also implemented the call with the provided Python client library (see FIG 1B) and in R, cheating a bit by reusing the query string generated by Python (see FIG 1C). No two responses had exactly the same shape, but the data itself matched in all three cases!
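The serialization step behind FIG 1A can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual cell: the field names are taken from the response in FIG 1A, the `placekey` argument to `lookup` is an assumption about the schema, and `build_payload` is a hypothetical helper name. The endpoint URL and auth header are left out on purpose.

```python
import json

# GraphQL query kept as a readable multiline string.
# Field names come from the FIG 1A response; the `placekey` argument
# is an assumption about how `lookup` is parameterized.
QUERY = """
query {
  lookup(placekey: "222-224@5vg-7gr-6kz") {
    safegraph_core {
      location_name
      street_address
      city
      region
      postal_code
    }
  }
}
"""


def build_payload(query):
    """Wrap a GraphQL query in a JSON request body (hypothetical helper)."""
    # json.dumps escapes the inner double quotes and the newlines,
    # which is the part that is easy to get wrong by hand.
    return json.dumps({"query": query})


payload = build_payload(QUERY)

# The body is valid JSON and round-trips back to the original query.
assert json.loads(payload)["query"] == QUERY
print(payload)
```

Keeping the query as a multiline string means it can be pasted into a GraphQL explorer unchanged, while the serialized form only exists at the moment the request is sent.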
Given what I know, I would still want the answers to the above questions, but now I would err towards having a more complete solution in one of the given languages. Authorization isn't too hard, so if we can't get a client library out in time, we should be able to produce docs with common use cases for the other language. I had never used R before and figured out a basic API call in less than an hour.
{'data': {'lookup': {'safegraph_core': {'brands': [{'brand_id': 'SG_BRAND_f116acfe9147494063e58da666d1d57e',
'brand_name': 'starbucks '
'coffee'}],
'category_tags': ['Snacks',
'Counter Service',
'Dessert',
'Tea House',
'Coffee Shop',
'Bakery'],
'city': 'San Francisco',
'closed_on': None,
'geometry_type': 'POLYGON',
'iso_country_code': 'US',
'latitude': 37.769035,
'location_name': 'Starbucks',
'longitude': -122.42775,
'naics_code': 722515,
'open_hours': '{ "Mon": [["5:30", '
'"19:30"]], "Tue": '
'[["5:30", "19:30"]], '
'"Wed": [["5:30", '
'"19:30"]], "Thu": '
'[["5:30", "19:30"]], '
'"Fri": [["5:30", '
'"19:30"]], "Sat": '
'[["5:30", "19:30"]], '
'"Sun": [["5:30", '
'"19:30"]] }',
'opened_on': None,
'parent_placekey': '222-226@5vg-7gr-6kz',
'phone_number': None,
'placekey': '222-224@5vg-7gr-6kz',
'postal_code': '94114',
'region': 'CA',
'safegraph_brand_ids': 'SG_BRAND_f116acfe9147494063e58da666d1d57e',
'street_address': '2020 Market St',
'sub_category': 'Snack and '
'Nonalcoholic Beverage '
'Bars',
'top_category': 'Restaurants and Other '
'Eating Places',
'tracking_closed_since': '2019-07-01'}}},
'extensions': {'row_count': 1, 'version_date': '1630442778__2021_08'}}
{'brands': {0: [{'brand_id': 'SG_BRAND_f116acfe9147494063e58da666d1d57e',
'brand_name': 'starbucks coffee'}]},
'category_tags': {0: ['Snacks',
'Counter Service',
'Dessert',
'Tea House',
'Coffee Shop',
'Bakery']},
'city': {0: 'San Francisco'},
'closed_on': {0: None},
'geometry_type': {0: 'POLYGON'},
'iso_country_code': {0: 'US'},
'latitude': {0: 37.769035},
'location_name': {0: 'Starbucks'},
'longitude': {0: -122.42775},
'naics_code': {0: 722515},
'open_hours': {0: '{ "Mon": [["5:30", "19:30"]], "Tue": [["5:30", "19:30"]], '
'"Wed": [["5:30", "19:30"]], "Thu": [["5:30", "19:30"]], '
'"Fri": [["5:30", "19:30"]], "Sat": [["5:30", "19:30"]], '
'"Sun": [["5:30", "19:30"]] }'},
'opened_on': {0: None},
'parent_placekey': {0: '222-226@5vg-7gr-6kz'},
'phone_number': {0: None},
'placekey': {0: '222-224@5vg-7gr-6kz'},
'postal_code': {0: '94114'},
'region': {0: 'CA'},
'safegraph_brand_ids': {0: 'SG_BRAND_f116acfe9147494063e58da666d1d57e'},
'street_address': {0: '2020 Market St'},
'sub_category': {0: 'Snack and Nonalcoholic Beverage Bars'},
'top_category': {0: 'Restaurants and Other Eating Places'},
'tracking_closed_since': {0: '2019-07-01'}}
List of 2
$ data :List of 1
..$ lookup:List of 1
.. ..$ safegraph_core:List of 22
.. .. ..$ placekey : chr "222-224@5vg-7gr-6kz"
.. .. ..$ latitude : num 37.8
.. .. ..$ longitude : num -122
.. .. ..$ street_address : chr "2020 Market St"
.. .. ..$ city : chr "San Francisco"
.. .. ..$ region : chr "CA"
.. .. ..$ postal_code : chr "94114"
.. .. ..$ iso_country_code : chr "US"
.. .. ..$ parent_placekey : chr "222-226@5vg-7gr-6kz"
.. .. ..$ location_name : chr "Starbucks"
.. .. ..$ safegraph_brand_ids : chr "SG_BRAND_f116acfe9147494063e58da666d1d57e"
.. .. ..$ brands :List of 1
.. .. .. ..$ :List of 2
.. .. .. .. ..$ brand_id : chr "SG_BRAND_f116acfe9147494063e58da666d1d57e"
.. .. .. .. ..$ brand_name: chr "starbucks coffee"
.. .. ..$ top_category : chr "Restaurants and Other Eating Places"
.. .. ..$ sub_category : chr "Snack and Nonalcoholic Beverage Bars"
.. .. ..$ naics_code : int 722515
.. .. ..$ phone_number : NULL
.. .. ..$ open_hours : chr "{ \"Mon\": [[\"5:30\", \"19:30\"]], \"Tue\": [[\"5:30\", \"19:30\"]], \"Wed\": [[\"5:30\", \"19:30\"]], \"Thu\""| __truncated__
.. .. ..$ category_tags :List of 6
.. .. .. ..$ : chr "Snacks"
.. .. .. ..$ : chr "Counter Service"
.. .. .. ..$ : chr "Dessert"
.. .. .. ..$ : chr "Tea House"
.. .. .. ..$ : chr "Coffee Shop"
.. .. .. ..$ : chr "Bakery"
.. .. ..$ opened_on : NULL
.. .. ..$ closed_on : NULL
.. .. ..$ tracking_closed_since: chr "2019-07-01"
.. .. ..$ geometry_type : chr "POLYGON"
$ extensions:List of 2
..$ row_count : int 1
..$ version_date: chr "1630442778__2021_08"
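The claim that the responses differ in shape but carry the same data can be checked mechanically for the two Python outputs. The sketch below uses trimmed copies of FIG 1A and FIG 1B; `records_from_columns` is a hypothetical helper, and it assumes the client library's output is column-oriented (`{column: {row_index: value}}`), which is what FIG 1B looks like.

```python
def records_from_columns(columns):
    """Turn {'col': {row_index: value}} into a list of per-row dicts."""
    row_ids = sorted({i for values in columns.values() for i in values})
    return [
        {col: values.get(i) for col, values in columns.items()}
        for i in row_ids
    ]


# Trimmed copy of the raw JSON response (FIG 1A).
raw = {
    "data": {"lookup": {"safegraph_core": {
        "city": "San Francisco",
        "location_name": "Starbucks",
        "naics_code": 722515,
        "phone_number": None,
        "placekey": "222-224@5vg-7gr-6kz",
    }}},
    "extensions": {"row_count": 1, "version_date": "1630442778__2021_08"},
}

# Trimmed copy of the client library output (FIG 1B).
client = {
    "city": {0: "San Francisco"},
    "location_name": {0: "Starbucks"},
    "naics_code": {0: 722515},
    "phone_number": {0: None},
    "placekey": {0: "222-224@5vg-7gr-6kz"},
}

core = raw["data"]["lookup"]["safegraph_core"]
rows = records_from_columns(client)

# Same data, different envelope.
assert rows == [core]
```

The R output in FIG 1C is truncated by `str()`, so it is left out of this comparison.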
In the first iteration of an API, the engineer creates a response that looks like this:
You notice that there is both a “safegraph_brand_ids” field and a “brands” field. Do you keep both? If not, which one do you keep? How do you decide?
“Don't ever take a fence down until you know the reason it was put up.”
― G. K. Chesterton
Again, my approach would be to gather a little more data before making a decision. Below I outline my thinking and approach.
Main questions:
It looks like this is based on actual data, given the response above, so we should have some users we can talk to. Who uses each field, for what, and what does it cost us? Based on those answers we can either sunset the redundant field, with clear docs on other patterns that solve the same problems, or keep both and update the docs with clear use cases and guidance on what to do if the two ever don't match.
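If both fields are kept, the "what if they don't match" case could be covered by a small check like the one below. It is only a sketch: `brand_ids_match` is a hypothetical helper, and the assumption that multiple ids in `safegraph_brand_ids` are comma-separated is mine, since the response above only shows a single id.

```python
def brand_ids_match(core):
    """Return True when safegraph_brand_ids agrees with the ids inside brands."""
    raw_ids = core.get("safegraph_brand_ids") or ""
    # Assumption: several ids, if they ever occur, share one comma-separated string.
    flat = {part.strip() for part in raw_ids.split(",") if part.strip()}
    nested = {brand["brand_id"] for brand in core.get("brands") or []}
    return flat == nested


# Values copied from the response shown earlier.
sample = {
    "safegraph_brand_ids": "SG_BRAND_f116acfe9147494063e58da666d1d57e",
    "brands": [{
        "brand_id": "SG_BRAND_f116acfe9147494063e58da666d1d57e",
        "brand_name": "starbucks coffee",
    }],
}

assert brand_ids_match(sample)

# A record where the two fields have drifted apart is flagged.
drifted = dict(sample, brands=[])
assert not brand_ids_match(drifted)
```

Running a check like this over existing data would also help answer the cost question: if the fields never disagree, one of them is purely redundant.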
How would you improve this example code snippet in the docs?

In order to understand the code a little better, I decided to implement it.
Notes:
cURL instructions of the docs. But still, we get an error when parsing the payload (see FIG 3B)
GraphQL portion of the docs (see FIG 3C)
To improve the example snippet (see FIG 3C):
Use json.dumps() to encode the string literal
Use response.json() instead of response.text (or use json.loads(response.text))
Taking the steps above will make the example snippet functional, easier to maintain, and more usable.
Ideally we would not be using string literals to build queries. Debugging can be a pain, which is why I opted for trying a workaround first.
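The two improvements above can be sketched as follows. This is not the docs' snippet, which is only available as a screenshot: `build_payload`, `is_valid_json`, and `fetch` are hypothetical names, the `placekey` argument and the `apikey` header name are assumptions, and the endpoint URL is passed in rather than guessed. `fetch` uses the third-party `requests` library and is defined but not called here.

```python
import json

# Fields taken from the FIG 3C response; the `placekey` argument is assumed.
QUERY = """
query {
  lookup(placekey: "224-222@5vg-7gv-d7q") {
    placekey
    safegraph_core {
      location_name
      street_address
      postal_code
      phone_number
      category_tags
    }
  }
}
"""


def build_payload(query):
    """Let json.dumps handle quoting instead of a hand-written string literal."""
    return json.dumps({"query": query})


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def fetch(url, api_key, query):
    """POST the query and return the decoded body as a dict."""
    import requests  # third-party; imported lazily so the rest runs without it

    response = requests.post(
        url,
        # Header name is an assumption; check the API docs.
        headers={"Content-Type": "application/json", "apikey": api_key},
        data=build_payload(query),
        timeout=30,
    )
    response.raise_for_status()
    # response.json() returns a dict; response.text is only the raw string.
    return response.json()


# A body glued together by hand keeps raw newlines and unescaped quotes,
# which is one way to end up with an invalid JSON payload.
hand_built = '{"query": "' + QUERY + '"}'
print(is_valid_json(hand_built))            # False
print(is_valid_json(build_payload(QUERY)))  # True
```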
As written, the snippet doesn't work. The payload line was truncated in the screenshot, giving us the error:
{"error":"Invalid JSON"}
{'error': 'Invalid JSON'}
{'data': {'lookup': {'placekey': '224-222@5vg-7gv-d7q',
'safegraph_core': {'category_tags': ['Counter Service',
'Late Night',
'Lunch',
'Fast Food',
'Drive Through',
'Breakfast',
'Mexican Food',
'Dinner'],
'location_name': 'Taco Bell',
'phone_number': '+14159791587',
'postal_code': '94107',
'street_address': '710 3rd St'}}},
'extensions': {'row_count': 1, 'version_date': '1630442778__2021_08'}}
{'category_tags': {0: ['Counter Service',
'Late Night',
'Lunch',
'Fast Food',
'Drive Through',
'Breakfast',
'Mexican Food',
'Dinner']},
'location_name': {0: 'Taco Bell'},
'phone_number': {0: '+14159791587'},
'placekey': {0: '224-222@5vg-7gv-d7q'},
'postal_code': {0: '94107'},
'street_address': {0: '710 3rd St'}}