• GraphQL Server-side Journey with Python

    After playing around with a few Python GraphQL libraries for a few weeks, I realized that a good GraphQL Python library should:

    • be less invasive: work on top of the existing stack (FastAPI/Starlette) and reuse as much code as possible (Pydantic)
    • generate the GraphQL schema from Python code, ideally from built-in types and Pydantic types
    • support Subscriptions out of the box

    Currently, I’m happy with Ariadne in a code-first approach. This post tracks the journey, the issues we found, and the workarounds/solutions.

    Graphene

    Both graphql.org and FastAPI point to https://graphene-python.org/, so we got started with it.

    As you may or may not know, GraphQL has a concept called a “Schema”. Graphene takes “a code-first approach”, which is cool:

     Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.

    ## Hello world works well, but it’s too verbose

    import graphene
    
    class Query(graphene.ObjectType):
      hello = graphene.String(name=graphene.String(default_value="World"))
    
      def resolve_hello(self, info, name):
        return 'Hello ' + name
      
    
    schema = graphene.Schema(query=Query)
    result = schema.execute('{ hello }')
    print(result.data['hello']) # "Hello World"
    

    Looks simple, yet still complex. There are too many graphene.* types. Why do I need to learn another typing system when that could be done by the framework? What about this one?

    # hello = graphene.String(name=graphene.String(default_value="World"))
    hello: str = "World" 
    

    Reuse Pydantic types with graphene-pydantic

    Since we’re using Pydantic, which has all the typing details, why not simply use Pydantic?! https://github.com/graphql-python/graphene-pydantic is exactly what we need! But even with graphene-pydantic, an adapter layer is required between Pydantic and Graphene, e.g.:

    
    import graphene
    from graphene_pydantic import PydanticInputObjectType
    from pydantic import BaseModel

    # A minimal Pydantic model, just for illustration.
    class PersonModel(BaseModel):
        id: int
        name: str

    class PersonInput(PydanticInputObjectType):
        class Meta:
            model = PersonModel
            # exclude specified fields
            exclude_fields = ("id",)

    class CreatePerson(graphene.Mutation):
        class Arguments:
            person = PersonInput()
        # more code trimmed
    

    Still very verbose, but much better than the original one. 

    ## Subscriptions are not well supported yet

    The documentation is super confusing (https://docs.graphene-python.org/projects/django/en/latest/subscriptions/):

    To implement websocket-based support for GraphQL subscriptions, you’ll need to do the following:

    1. Install and configure django-channels.
    2. Install and configure a third-party module for adding subscription support over websockets. A few options include: graphql-python/graphql-ws, datavance/django-channels-graphql-ws, jaydenwindle/graphene-subscriptions.
    3. Ensure that your application (or at least your GraphQL endpoint) is being served via an ASGI protocol server like daphne (built in to django-channels), uvicorn, or hypercorn.

    • Note: By default, the GraphiQL interface that comes with graphene-django assumes that you are handling subscriptions at the same path as any other operation (i.e., you configured both urls.py and routing.py to handle GraphQL operations at the same path, like /graphql).

    What? Why is Django mentioned? I’m not interested, and I’m lost!

    Maybe it’s time to move on.

    Ariadne

    This is from Graphene’s “Getting started”:

     Compare Graphene’s code-first approach to building a GraphQL API with schema-first approaches like Apollo Server (JavaScript) or Ariadne (Python). Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.

    Yeah, schema-first is not cool, but Ariadne’s documentation looks much better than Graphene’s.

    Subscriptions: it just works

    After the experience with Graphene, the first feature I checked was subscriptions: https://ariadnegraphql.org/docs/subscriptions. It’s simple and it just works! The documentation is clean, and no Django is mentioned at all!

    import asyncio
    from ariadne import SubscriptionType, make_executable_schema
    from ariadne.asgi import GraphQL
    
    type_def = """
        type Query {
            _unused: Boolean
        }
    
        type Subscription {
            counter: Int!
        }
    """
    
    subscription = SubscriptionType()
    
    @subscription.source("counter")
    async def counter_generator(obj, info):
        for i in range(5):
            await asyncio.sleep(1)
            yield i
    
    
    @subscription.field("counter")
    def counter_resolver(count, info):
        return count + 1
    
    
    schema = make_executable_schema(type_def, subscription)
    app = GraphQL(schema, debug=True)
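
    To try it out (assuming the file above is saved as app.py), serve it with any ASGI server, e.g. uvicorn app:app, then run the subscription { counter } operation in the GraphQL Playground: it streams five incremented counter values.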
    

    Schema-first? It doesn’t have to be

    What if I change counter_generator to return str? I need to update the schema. If I forget that, I’m lying to my users. I hate it.

    In the above example, type_def is kind of a duplication of the method counter_generator (if we add a return type), like:

    async def counter_generator(obj, info) -> int
    

    The schema looks reasonably easy to generate, so why can’t we generate it from Python code, especially with Pydantic? If we define a method with proper typing, we could generate the schema easily:

    
    from uuid import UUID, uuid4

    from ariadne import QueryType, make_executable_schema, snake_case_fallback_resolvers
    from ariadne.asgi import GraphQL
    from pydantic import BaseModel


    class HelloMessage(BaseModel):
        body: str
        from_user: UUID


    query = QueryType()


    @query.field('hello')
    def resolve_hello(_, info) -> HelloMessage:
        request = info.context['request']
        user_agent = request.headers.get('user-agent', 'guest')
        return HelloMessage(
            body='Hello, %s!' % user_agent,
            from_user=uuid4(),
        )


    # Generate type_defs from Pydantic types in the query definition.
    # generate_gql_schema_str is the small utility from the repo linked below.
    type_defs = generate_gql_schema_str([query])

    schema = make_executable_schema(
        type_defs, query, snake_case_fallback_resolvers,
    )
    app = GraphQL(schema, debug=True)
    

    The details can be found here: https://github.com/gary-liguoliang/ariadne-pydantic/blob/master/example/main.py
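
    For intuition, a toy generator could map Pydantic fields to GraphQL scalars and emit SDL from the models used in resolvers. This is a simplified sketch (it assumes Pydantic v2’s model_fields and a hand-rolled SCALARS mapping), not the actual code from that repo:

    from uuid import UUID

    from pydantic import BaseModel

    # Hypothetical scalar mapping; a real generator also handles optionals, lists, enums, etc.
    SCALARS = {str: "String", int: "Int", float: "Float", bool: "Boolean", UUID: "ID"}

    def model_to_sdl(model: type[BaseModel]) -> str:
        """Emit a GraphQL type definition from a Pydantic model's fields."""
        fields = "\n".join(
            f"    {name}: {SCALARS[info.annotation]}!"
            for name, info in model.model_fields.items()
        )
        return f"type {model.__name__} {{\n{fields}\n}}"

    class HelloMessage(BaseModel):
        body: str
        from_user: UUID

    print(model_to_sdl(HelloMessage))
    # type HelloMessage {
    #     body: String!
    #     from_user: ID!
    # }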

    With a small schema-generation utility, we managed to run Ariadne in a code-first approach:

    • The code is much simpler than the original versions from both Ariadne and Graphene.
    • It reuses Pydantic typing.
    • The GraphQL query definition method is very simple: take the input, forward it to the core application, and return the output.
  • Speed Up Your Django Tests

    I read the book “Speed Up Your Django Tests” this week. A few interesting items:

    Background/disclaimer: I’m new to Django, and I use pytest to run many Django integration tests, so the points listed here are purely from my point of view.

    1. Override settings with @override_settings: if you want to override a setting for a test method, Django provides the override_settings() decorator (see PEP 318). A short sketch follows this list.
    2. Show slow tests with pytest --durations 10.
    3. Test markers: categorize/tag tests so that you can run different subsets, like JUnit categories. For more details: https://docs.pytest.org/en/latest/example/markers.html
    4. Reduce pytest test collection time by setting norecursedirs.
    5. Run tests in parallel with pytest-xdist.
    6. Django’s RequestFactory: this is similar to the test client, but instead of making requests, it “provides a way to generate a request instance that can be used as the first argument to any view” (Django docs).
    7. Django’s SimpleTestCase: a subclass of unittest.TestCase that “disallows database queries by default”; however, you can still turn them on.
    8. Avoid fixture files [11.1]: “For data you need in individual tests, you’re better off creating it in the test case or test method.” I have to say it’s very easy to set up test data with fixtures, but it soon becomes unmanageable. A few valid points:

      Fixture files are separate from the tests that use them. This makes it hard to determine which tests use which objects. The files tend to become “append-only,”… when a new test needs a new object, it tends to be added to an existing file… if there’s some data that most of your application depends on, using a fixture causes unnecessary reloading. It will be loaded and then rolled back for each test case, even when the next test case needs the exact same data.
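
    To make points 1, 3, 6, and 7 concrete, here is a minimal sketch; the URL, test names, view function, and the slow marker are hypothetical, and the marker would need to be registered in pytest.ini:

    import pytest
    from django.test import RequestFactory, SimpleTestCase, override_settings


    @pytest.mark.slow  # point 3: tag tests, then select subsets, e.g. pytest -m "not slow"
    class HomePageTests(SimpleTestCase):  # point 7: database queries are disallowed by default

        @override_settings(DEBUG=True)  # point 1: override a setting for one test method
        def test_home_page_renders(self):
            response = self.client.get("/")  # hypothetical URL
            self.assertEqual(response.status_code, 200)

        def test_view_with_request_factory(self):
            # point 6: build a request directly instead of going through the test client
            request = RequestFactory().get("/")
            response = home_view(request)  # hypothetical view function
            self.assertEqual(response.status_code, 200)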

    Overall, I would say it’s a good Django testing book for newbies like me. The book also covers many other topics, such as “Profiling” and “Mocking”, and gives me many more topics and links to explore Django.

    However, slow tests generally indicate design issues. All the techniques mentioned in the book can definitely help speed up the tests themselves, but if we take one step further, should we start thinking about the design?

    (Figure: “Abstraction”, from Architecture Patterns with Python)

    If we could fundamentally resolve some design issues, I believe we’d end up with far fewer integration tests.

  • How to get a software engineering job in Singapore

    If you’re thinking about it, you might want to start with these two websites:

    1. MyCareersFuture.sg

    This is a portal that aims to provide Singapore Citizens and Permanent Residents with a fast and smart job search service to match them with relevant jobs… The portal was developed by Workforce Singapore, in partnership with the Government Technology Agency.

    This is not just another government website:

    From 1 Aug 2014, under the Fair Consideration Framework (FCF) by the Ministry of Manpower (MOM), companies seeking to hire Employment Pass (EP) holders are required to post their job vacancies on MyCareersFuture.sg for at least 14 calendar days before an EP application is submitted to MOM. For more information on FCF, click here.

    So technically speaking, the HR department will list the job vacancies as early as possible.

    This is not all: the most important feature for me is the salary information. Even though it’s only a range, it really helps me understand the market (on some level). For example, by reading a few listings here, I know that:

    Some of the packages may have different components, but the range gives me much better visibility of the market.

    2. efinancialcareers.sg

    As you may know, many financial institutions have development teams in Singapore, so you’d better set up your profile at efinancialcareers.sg, even if you have no interest in this industry.

    Setting up a profile here will attract most of the recruitment agencies; they will try to talk to you and share their opportunities. Be open-minded and talk with them: they will try their best to find a matching opportunity for you.

    Other websites such as LinkedIn and Indeed are also very helpful, but for me, “MyCareersFuture” and “efinancialcareers” are the most effective ones.

    Good luck!

  • Google Cloud Tasks: use queue.yaml to control the rate of slow queues

    We have a service with an HTTP request rate limit: less than 1 message per 10 seconds. We don’t use this service frequently, but when we do, we send two requests sequentially; as expected, we received a few HTTP 429 errors.

    I kind of agree that it’s my responsibility to control the rate, but I don’t want my code to be aware of these constraints, so we decided to let Cloud Tasks control the rate.

    First try

    I don’t know much about token buckets; glancing at the help doc, I thought it would help to set --max-dispatches-per-second=0.01 (one message per 100 seconds, comfortably under our one-per-10-seconds limit) with:

    gcloud tasks queues update my-task-queue --max-concurrent-dispatches=1 --max-dispatches-per-second=0.01
    

    However, we noticed that the HTTP 429 errors persisted after the change; the task queue log showed tasks being dispatched almost at the same time. Then we checked maxBurstSize:

     Each queue has a token bucket that holds tokens, up to the maximum specified by maxBurstSize. Each time a task is dispatched, a token is removed from the bucket. Tasks will be dispatched until the queue’s bucket runs out of tokens. The bucket will be continuously refilled with new tokens based on maxDispatchesPerSecond.

    This field is an output value of gcloud: gcloud tasks queues describe my-task-queue shows that maxBurstSize is 10.
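
    As a toy illustration of why both requests went out at once, here is a simplified token-bucket model (illustrative only, not Cloud Tasks internals):

    tokens = 10.0                 # maxBurstSize: the bucket starts full
    refill_per_second = 0.01      # --max-dispatches-per-second

    for task in ("request-1", "request-2"):
        if tokens >= 1:           # a full bucket lets tasks through immediately
            tokens -= 1
            print(task, "dispatched at t=0")  # both go out back-to-back -> HTTP 429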

    So the bucket holds 10 tokens initially: even though I set the rate, the first calls run immediately because 10 tokens are available right there. I read the document again and found:

     In most cases, using the Cloud Tasks API method and letting the system set max_burst_size produces a very efficient rate for managing request bursts. In some cases, however, particularly when the desired rate is relatively slow, either using the queue.yaml method to manually set bucket_size to a small value, or setting your max_concurrent_dispatches to a small value via the Cloud Tasks API can give you more control. https://cloud.google.com/tasks/docs/configuring-queues#rate

    Second try

    We set bucket_size to 1 using queue.yaml, and the task queue log now shows tasks dispatched right at the rate I set.
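
    For reference, a minimal queue.yaml along these lines might look like the following (the queue name and rate are illustrative, not our actual values; it is deployed with gcloud app deploy queue.yaml):

    queue:
    - name: my-task-queue
      rate: 1/m                   # well under the one-message-per-10-seconds limit
      bucket_size: 1              # no token buildup, so no initial burst
      max_concurrent_requests: 1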

    That’s not all; you’d better read this one before using queue.yaml:

    Pitfalls of mixing queue.yaml and gcloud command

    These posts also help:

  • Working at a startup for 6 months

    It has been (almost) 6 months in my new role at a startup. I feel like I’m running every day and have no time to really think about the change. Is this kind of the “new stupid”? I hope not, so I want to summarize and write down the differences I’ve learned.

    I managed to concentrate on my work and “run as fast as I can”

    Well, I’m not saying that I didn’t concentrate on my work in my previous jobs. What I mean is that I literally spend all my time on my work, not on meetings, emails, reports, etc. I used to spend a lot of time updating my manager on what I was doing, and I needed to produce nice reports for my managers and sometimes my manager’s managers.

    The bottleneck is not the infrastructure team or the change management team; it’s just me. I have access to all products, and I can deploy or destroy the production environment anytime, as long as I’m aware of what I’m doing. That’s cool!

    Deliver the product, not beautiful code

    I used to work in a relatively bigger team of ~20 developers, split into three groups focused on different products. Most of the banks I worked for had “change-freeze” periods, and December was usually one of them. One day, my manager declared that we would do “platform enhancements”, with a few nice diagrams. However, at that moment we all knew that our product had bugs, like users being unable to buy our products just because they didn’t have a valid date of birth in the CRM.

    I wasn’t very happy with it because the priority was wrong to me: if I received 10% of each transaction, I would fix all the known production issues ASAP. But I wasn’t receiving any commission, and neither was my manager. So we had a nice holiday season of refactoring. I enjoyed it, but I don’t think the stakeholders would be happy if they really understood what we were doing.

    Small startups are completely different: with limited resources, we need to deliver a working product and get actual feedback from the market. Customers might like the product because of a nice design, but definitely not because of my beautiful code.

    Size Does Matter

    I didn’t really understand this when I first read it a few years back, but now I do:

    We’re small and we like it that way. It gives us the ability to turn on a dime, deliver projects quickly, and dedicate extraordinary attention to your assignment. Our size allows us to work on projects we want to do rather than projects we have to do. Plus we can all fit in one cab if we squeeze. [https://37signals.com/05.html]

    Do everything and get the job done

    In my previous job, one of my teammates and I acted as the “DevOps” team for one year. There was a big gap between the code and a functioning product, and nobody wanted to get their hands dirty to fix it. We decided to do everything required to turn the code into a product, without writing much code for the product itself. We handled many tasks: managing stakeholders’ expectations, getting involved in system design (we gave up soon because of manager-driven design), code review, quality control, deployment, and production support.

    The mindset is different here: we don’t have dedicated teams to help with databases, messaging queues, networking, etc. So in 6 months, I picked up Python and Django, got my hands dirty with Google Cloud, delivered a few features, and of course fixed bugs created by myself.

