• Python Decorator

    What is a Python Decorator?

    Less readable code:

    def foo(cls):
        pass
    foo = synchronized(lock)(foo)
    foo = classmethod(foo)
    

    A much more readable version:

    @classmethod
    @synchronized(lock)
    def foo(cls):
        pass
    

    https://www.python.org/dev/peps/pep-0318/

    What?! A decorator on a decorator?

    def convert_to_upper_case(f):
        """
        A simple decorator to convert the return string to upper case.
        """
        def uppercase(*args, **kwargs):
            print("upper starts....")
            r = f(*args, **kwargs)
            return r.upper()
        return uppercase
    
    
    def add_prefix(f):
        """
        A simple decorator to add a prefix to the return value.
        """
        def pre(*args, **kwargs):
            r = f(*args, **kwargs)
            return f"[prefix] {r}"
        return pre
    
    
    def add_prefix_and_convert_to_upper(f):
        """
        A combination of `convert_to_upper_case` and `add_prefix`.
        """
        @convert_to_upper_case
        @add_prefix
        def convert(*args, **kwargs):
            r = f(*args, **kwargs)
            return r
        # also works:
        # convert = add_prefix(convert)
        # convert = convert_to_upper_case(convert)
        return convert
    
    
    # @add_prefix
    # @convert_to_upper_case
    @add_prefix_and_convert_to_upper
    def hello():
        return "Python"
    
    
    print(f"output: {hello()}")
    
    

    In the above example, `@add_prefix` is equivalent to `hello = add_prefix(hello)`, and `@add_prefix_and_convert_to_upper` is equivalent to `hello = convert_to_upper_case(add_prefix(hello))`.
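
    Running the example should print:

    upper starts....
    output: [PREFIX] PYTHON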

    In a debugger, `hello` is:

    <function convert_to_upper_case.<locals>.uppercase at 0x10e8da200>
    

    `hello` can still be `hello` if `@wraps(f)` (from `functools`) is added in the decorator, e.g.:

    from functools import wraps

    def convert_to_upper_case(f):
        @wraps(f)
        def uppercase(*args, **kwargs):
            print("upper starts....")
            r = f(*args, **kwargs)
            return r.upper()
        return uppercase
    

    `hello` is `<function hello at 0x10a3bd200>` now!

    @wraps is a decorator to:

    Update a wrapper function to look like the wrapped function
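
    For example, with a made-up `shout` decorator:

    from functools import wraps

    def shout(f):
        @wraps(f)  # copies __name__, __doc__, __module__, etc. from f
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs).upper()
        return wrapper

    @shout
    def greet():
        """Say hi."""
        return "hi"

    print(greet.__name__, "-", greet.__doc__)  # greet - Say hi.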

    What about using a context manager as a decorator?

    contextlib.ContextDecorator

    A base class that enables a context manager to also be used as a decorator.

    Context managers inheriting from ContextDecorator have to implement `__enter__` and `__exit__` as normal. `__exit__` retains its optional exception handling even when used as a decorator.

    How does it work?

    contextlib.ContextDecorator:

    def __call__(self, func):
        @wraps(func)
        def inner(*args, **kwds):
            # _recreate_cm() returns a fresh context manager instance,
            # so every call to the decorated function gets its own context
            with self._recreate_cm():
                return func(*args, **kwds)
        return inner
    

    so that a context manager can be used in both ways:

    >>> @mycontext()
    ... def function():
    ...     print('The bit in the middle')
    ...
    >>> function()
    Starting
    The bit in the middle
    Finishing
    
    >>> with mycontext():
    ...     print('The bit in the middle')
    ...
    Starting
    The bit in the middle
    Finishing
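
    For reference, `mycontext` above is the ContextDecorator example from the contextlib documentation, roughly:

    from contextlib import ContextDecorator

    class mycontext(ContextDecorator):
        def __enter__(self):
            print('Starting')
            return self

        def __exit__(self, *exc):
            print('Finishing')
            return False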
    
  • GraphQL Server-side Journey with Python

    After playing around with a few Python GraphQL libraries for a few weeks, I realized that a good GraphQL Python lib should:

    • be less invasive: work on top of the existing stack (FastAPI/Starlette) and reuse as much code as possible (Pydantic)
    • generate the GraphQL schema from Python code, ideally from built-in types and Pydantic types
    • support Subscriptions out of the box

    Currently, I’m happy with Ariadne in a code-first approach. This post tracks the journey, the issues we found, and the workarounds/solutions.

    Graphene

    Both graphql.org and FastAPI point to https://graphene-python.org/, so I got started with it.

    As you may or may not know, GraphQL has a concept called “Schema”. Graphene takes “a code-first approach”, which is cool:

     Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.

    Hello world works well, but it’s too verbose

    import graphene
    
    class Query(graphene.ObjectType):
      hello = graphene.String(name=graphene.String(default_value="World"))
    
      def resolve_hello(self, info, name):
        return 'Hello ' + name
      
    
    schema = graphene.Schema(query=Query)
    result = schema.execute('{ hello }')
    print(result.data['hello']) # "Hello World"
    

    Looks simple, yet still complex: there are too many `graphene` types. Why do I need to learn another typing system for something the framework could figure out? What about this one?

    # hello = graphene.String(name=graphene.String(default_value="World"))
    hello: str = "World" 
    

    Reuse Pydantic types with graphene-pydantic

    Since we’re using Pydantic, which has all the typing details, why not simply use Pydantic?! https://github.com/graphql-python/graphene-pydantic is exactly what we need! But even with graphene-pydantic, an adapter layer is required between Pydantic and Graphene, e.g.:

    
    import graphene
    from graphene_pydantic import PydanticInputObjectType

    class PersonInput(PydanticInputObjectType):
        class Meta:
            model = PersonModel  # a Pydantic model defined elsewhere
            # exclude specified fields
            exclude_fields = ("id",)

    class CreatePerson(graphene.Mutation):
        class Arguments:
            person = PersonInput()
        # more code trimmed
    

    Still very verbose, but much better than the original one. 

    Subscriptions are not well supported yet

    The documentation is super confusing: https://docs.graphene-python.org/projects/django/en/latest/subscriptions/:

    To implement websocket-based support for GraphQL subscriptions, you’ll need to do the following:

    1. Install and configure django-channels.
    2. Install and configure a third-party module for adding subscription support over websockets. A few options include: graphql-python/graphql-ws, datavance/django-channels-graphql-ws, jaydenwindle/graphene-subscriptions.
    3. Ensure that your application (or at least your GraphQL endpoint) is being served via an ASGI protocol server like daphne (built in to django-channels), uvicorn, or hypercorn.

    • Note: By default, the GraphiQL interface that comes with graphene-django assumes that you are handling subscriptions at the same path as any other operation (i.e., you configured both urls.py and routing.py to handle GraphQL operations at the same path, like /graphql).

    What? Why is Django mentioned? I’m not interested, and I’m lost!

    Maybe it’s time to move on.

    Ariadne

    This is from Graphene’s “Getting started”:

     Compare Graphene’s code-first approach to building a GraphQL API with schema-first approaches like Apollo Server (JavaScript) or Ariadne (Python). Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.

    Yeah, schema-first is not cool, but Ariadne’s documentation looks much better than Graphene’s.

    Subscriptions, it just works

    After the experience with Graphene, the first feature I checked was Subscriptions: https://ariadnegraphql.org/docs/subscriptions. It’s simple and it just works! The documentation is clean, and no Django is mentioned at all!

    import asyncio
    from ariadne import SubscriptionType, make_executable_schema
    from ariadne.asgi import GraphQL
    
    type_def = """
        type Query {
            _unused: Boolean
        }
    
        type Subscription {
            counter: Int!
        }
    """
    
    subscription = SubscriptionType()
    
    @subscription.source("counter")
    async def counter_generator(obj, info):
        for i in range(5):
            await asyncio.sleep(1)
            yield i
    
    
    @subscription.field("counter")
    def counter_resolver(count, info):
        return count + 1
    
    
    schema = make_executable_schema(type_def, subscription)
    app = GraphQL(schema, debug=True)
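
    The `app` above is a standard ASGI application, so (assuming uvicorn is installed) it can be served with a small entry point like:

    import uvicorn

    # `app` is the ariadne.asgi.GraphQL instance defined above
    if __name__ == "__main__":
        uvicorn.run(app, host="127.0.0.1", port=8000)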
    

    Schema-first? It doesn’t have to be

    What if I change counter_generator to return str? I need to update the schema, and if I forget that, I’m lying to my users. I hate it.

    In the above example, type_def is more or less a duplication of the method counter_generator, once we add a return type, like:

    async def counter_generator(obj, info) -> int: ...
    

    The schema looks reasonably easy to generate, so why can’t we generate it from Python code, especially with Pydantic? If we define a method with proper typing, we could generate the schema easily:

    
    from uuid import UUID, uuid4

    from ariadne import QueryType, make_executable_schema, snake_case_fallback_resolvers
    from ariadne.asgi import GraphQL
    from pydantic import BaseModel


    class HelloMessage(BaseModel):
        body: str
        from_user: UUID
    
    
    query = QueryType()
    
    
    @query.field('hello')
    def resolve_hello(_, info) -> HelloMessage:
        request = info.context['request']
        user_agent = request.headers.get('user-agent', 'guest')
        return HelloMessage(
            body='Hello, %s!' % user_agent,
            from_user=uuid4(),
        )
    
    
    # Generate type_defs from the Pydantic types used in the query definitions.
    # (generate_gql_schema_str is the small utility from the repo linked below.)
    type_defs = generate_gql_schema_str([query])
    
    schema = make_executable_schema(
        type_defs, query, snake_case_fallback_resolvers,
    )
    app = GraphQL(schema, debug=True)
    

    The details can be found here: https://github.com/gary-liguoliang/ariadne-pydantic/blob/master/example/main.py
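
    The core idea is easy to sketch (a toy version, not the actual utility from the repo; `PY_TO_GQL` and `model_to_sdl` are made-up names, and it only handles flat models with scalar fields):

    from uuid import UUID

    from pydantic import BaseModel

    # Map Python/Pydantic field types to GraphQL scalar names.
    PY_TO_GQL = {str: "String", int: "Int", float: "Float", bool: "Boolean", UUID: "ID"}

    def model_to_sdl(model) -> str:
        """Render a Pydantic model as a GraphQL SDL type definition."""
        fields = "\n".join(
            f"    {name}: {PY_TO_GQL[tp]}!"
            for name, tp in model.__annotations__.items()
        )
        return f"type {model.__name__} {{\n{fields}\n}}"

    class HelloMessage(BaseModel):
        body: str
        from_user: UUID

    print(model_to_sdl(HelloMessage))
    # type HelloMessage {
    #     body: String!
    #     from_user: ID!
    # }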

    With a small schema-generation utility, we managed to run Ariadne in a code-first approach:

    • The code is much simpler than the original versions of both the Ariadne and Graphene examples.
    • It reuses the Pydantic typing.
    • The GraphQL query definition method is very simple: take the input, forward it to the core application, return the output.
  • Speed Up Your Django Tests

    I read the book “Speed Up Your Django Tests” this week; a few interesting items:

    Background/disclaimer: I’m new to Django, and I use pytest to run many Django integration tests, so the points listed here are purely from my point of view.

    1. Override settings with @override_settings: in case you want to override a setting for a test method, Django provides the override_settings() decorator (see PEP 318). (See the sketch after this list.)
    2. Show slow tests with pytest --durations 10.
    3. Test markers: categorize/tag tests so that you can run different subsets, like JUnit categories. For more details: https://docs.pytest.org/en/latest/example/markers.html
    4. Reduce pytest test-collection time by setting norecursedirs.
    5. Run tests in parallel with pytest-xdist.
    6. Django’s RequestFactory: this is similar to the test client, but instead of making requests, it “provides a way to generate a request instance that can be used as the first argument to any view” (Django docs).
    7. Django’s SimpleTestCase: a subclass of unittest.TestCase that “disallows database queries by default”; however, you can still turn them on.
    8. Avoid fixture files [11.1]: “For data you need in individual tests, you’re better off creating it in the test case or test method.” I have to say it’s very easy to set up test data with fixtures, but it soon becomes unmanageable. A few valid points:

      Fixture files are separate from the tests that use them. This makes it hard to determine which tests use which objects. The files tend to become “append-only,”…when a new test needs a new object, it tends to be added to an existing file…if there’s some data that most of your application depends on, using a fixture causes unnecessary reloading. It will be loaded and then rolled back for each test case, even when the next test case needs the exact same data.
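
    To make items 1, 3, 6 and 7 concrete, here is a minimal sketch (FEATURE_FLAG, the slow marker, and hello_view are made up; the marker would be registered in pytest.ini):

    import pytest
    from django.conf import settings
    from django.http import HttpResponse
    from django.test import RequestFactory, SimpleTestCase, override_settings


    def hello_view(request):  # a made-up view under test
        return HttpResponse("hi")


    @override_settings(FEATURE_FLAG=True)  # item 1: override a setting for these tests
    class FeatureFlagTests(SimpleTestCase):  # item 7: database queries are disallowed
        def test_flag_is_on(self):
            self.assertTrue(settings.FEATURE_FLAG)

        def test_view(self):
            # item 6: build a request directly, no URL routing or middleware involved
            request = RequestFactory().get("/hello/")
            self.assertEqual(hello_view(request).status_code, 200)


    @pytest.mark.slow  # item 3: select subsets with `pytest -m slow` or `-m "not slow"`
    def test_expensive_computation():
        assert sum(range(1_000_000)) == 499_999_500_000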

    Overall, I would say it’s a good Django testing book for newbies like me. The book also covers many other topics, such as “Profiling” and “Mocking”, and leaves plenty of topics and links for me to explore Django further.

    However, slow tests generally indicate design issues. All the techniques mentioned in the book can definitely help speed up the tests themselves, but if we take one step further, shouldn’t we start thinking about the design?

    [Figure: “Abstraction”, from Architecture Patterns with Python]

    If we could fundamentally resolve some design issues, I believe we’d end up with far fewer integration tests.

  • How to get a software engineering job in Singapore

    If you’re thinking about it, you might want to start with these two websites:

    1. MyCareersFuture.sg

    This is a portal that aims to provide Singapore Citizens and Permanent Residents with a fast and smart job search service to match them with relevant jobs… The portal was developed by Workforce Singapore, in partnership with the Government Technology Agency.

    This is not just another government website:

    From 1 Aug 2014, under the Fair Consideration Framework (FCF) by the Ministry of Manpower (MOM), companies seeking to hire Employment Pass (EP) holders are required to post their job vacancies on MyCareersFuture.sg for at least 14 calendar days before an EP application is submitted to MOM. For more information on FCF, click here.

    So technically speaking, HR departments will list job vacancies there as early as possible.

    This is not all: the most important feature for me is the salary info. Even though it’s only a range, it really helps me understand the market (on some level) after reading just a few listings.

    Some of the packages may have different components, but the range gives me much better visibility of the market.

    2. efinancialcareers.sg

    As you may know, many financial institutions have development teams in Singapore, so you’d better set up your profile at efinancialcareers.sg, even if you have no interest in this industry.

    Setting up a profile here will attract recruitment agencies; they will try to talk to you and share their opportunities. Be open-minded and talk with them: they will try their best to find a matching opportunity for you.

    Other websites such as LinkedIn and Indeed are also very helpful, but for me, MyCareersFuture and efinancialcareers are the most effective ones.

    Good luck!

  • Google Cloud Tasks: use queue.yaml to control the rate of slow queues

    We have a service with an HTTP request rate limit: less than 1 message per 10 seconds. We don’t use this service frequently, but when we do, we send two requests sequentially, and, as expected, we received a few HTTP 429 errors.

    I kind of agree that it’s my responsibility to control the rate, but I don’t want my code to be aware of these constraints, so we decided to let Cloud Tasks control the rate.

    First try

    I didn’t know much about token buckets; glancing through the help doc, I thought setting --max-dispatches-per-second=0.01 (one dispatch per 100 seconds, comfortably under the limit) would do the trick:

    gcloud tasks queues update my-task-queue --max-concurrent-dispatches=1 --max-dispatches-per-second=0.01
    

    However, we noticed that the HTTP 429s persisted after the change; the task queue log showed tasks being dispatched almost at the same time, until we checked maxBurstSize:

     Each queue has a token bucket that holds tokens, up to the maximum specified by maxBurstSize. Each time a task is dispatched, a token is removed from the bucket. Tasks will be dispatched until the queue’s bucket runs out of tokens. The bucket will be continuously refilled with new tokens based on maxDispatchesPerSecond.

    This field is an output value of gcloud: gcloud tasks queues describe my-task-queue shows that maxBurstSize is 10.

    So the bucket starts with 10 tokens: even though I set the rate, the first calls run immediately because 10 tokens are available right away. Reading the document again, I found:

     In most cases, using the Cloud Tasks API method and letting the system set max_burst_size produces a very efficient rate for managing request bursts. In some cases, however, particularly when the desired rate is relatively slow, either using the queue.yaml method to manually set bucket_size to a small value, or setting your max_concurrent_dispatches to a small value via the Cloud Tasks API can give you more control. https://cloud.google.com/tasks/docs/configuring-queues#rate
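
    Before fixing it, a toy simulation of the documented token-bucket behaviour (my own sketch, not Cloud Tasks code) shows why the first requests went out together:

    # Simulate dispatch times for a queue whose token bucket starts full.
    def dispatch_times(num_tasks, max_burst_size=10, dispatches_per_second=0.01):
        tokens = max_burst_size  # the bucket starts full
        now = 0.0
        times = []
        for _ in range(num_tasks):
            if tokens < 1:
                # wait until one token has been refilled
                now += (1 - tokens) / dispatches_per_second
                tokens = 1
            tokens -= 1
            times.append(now)
        return times

    print(dispatch_times(12))
    # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 100.0, 200.0]
    # the first 10 tasks go out immediately; only then does the rate apply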

    Second try

    Setting bucket_size to 1 using queue.yaml did the trick: the task queue log shows tasks being dispatched right at the rate I set.
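
    For reference, a minimal queue.yaml sketch (the queue name and rate here are examples; pick a rate that matches your limit):

    queue:
    - name: my-task-queue
      rate: 6/m                   # at most one dispatch every 10 seconds
      bucket_size: 1              # no initial burst of tokens
      max_concurrent_requests: 1  # one task in flight at a time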

    That’s not all: you’d better read this one before using queue.yaml:

    Pitfalls of mixing queue.yaml and gcloud command


