-
My thoughts on Python vs Java
After working in both Python and Java for a while, I want to share my thoughts on the two languages.
Popularity
My current project has a REST API that lets users query price data. Our users are mostly big corporations, such as energy companies and financial institutions. To help users get started with the API, we also provide a few code samples, in both Python (our primary language) and Java. We added Java because we think it’s more popular in big companies.
In the last 2 years, we received many questions about the Python code sample, but zero questions about the Java sample. Why? One reason could be that the Java sample is perfect and everybody understands it well. But after checking the users’ email signatures, I found that most of them are not developers; they’re analysts and traders who just want to copy some simple code and run it.
In terms of popularity among non-developers, Python is clearly far more popular. IMO:
- if I’m going to start a new project which involves business users, I should probably start with Python.
- if I need to release an SDK related to data, I should probably start with Python.
Delivery Speed
If I need to build a PoC project quickly, I’ll start with a Python script. Yes, just a script: why bother with a Java project? The language is simple and easy to communicate with non-technical users, and most importantly, there are so many open-source libraries and modules available.
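For instance, a quick PoC can be a single throwaway file: no build tool, no project scaffolding. A hypothetical sketch (the data and names are made up) that summarizes prices per product from a CSV, using only the standard library:

```python
# poc.py: a hypothetical throwaway PoC script, run with `python poc.py`.
# It sums prices per product from an inline CSV snippet.
import csv
import io

CSV_DATA = """product,price
gas,42.5
power,31.2
power,28.9
"""

totals = {}  # product name -> total price
for row in csv.DictReader(io.StringIO(CSV_DATA)):
    totals[row["product"]] = totals.get(row["product"], 0.0) + float(row["price"])

print(totals)
```

The equivalent Java version would need a project layout, a build file, and a class declaration before the first line of actual logic.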
Bigger projects? More developers?
What happens if the PoC script goes well, the project gets bigger, and more developers join? We need to split the big project into smaller modules (but not separate projects yet, to save the time of releasing internal packages). What happens if I want to have a common module? Currently, we’re using Poetry: “To depend on a library located in a local directory or file, you can use the path property:”

```toml
my-package = { path = "../my-package/", develop = false }
```
This works for simple projects, but for a project with a different module structure, some workarounds are needed, such as creating a symbolic link. On the Java side, however, Maven supports this perfectly.

With more developers on board, more runtime exceptions may happen due to the dynamic nature of the language. Java is a much safer language to use here: the compiler catches many of the errors that would only surface at runtime in Python.

In short, Java is more enterprise-ready than Python, but I believe Python is catching up.
Ecosystem, supply chain
“Spring” is the most popular Java framework, backed by a listed company. It has almost everything; some Java developers can’t survive without Spring. On the Python side, however, libraries are less commercialized, which means you may need to raise a pull request to fix your own issue.
-
Pull requests should be treated as database transactions: all kinds of changes should be included
A few weeks ago, I received a request to update the pricing logic for certain products. I made the code change; a silly example:
```python
from decimal import Decimal


def get_price(product: Product) -> Decimal:
    if product.pricing_strategy == "10-percent-off":
        return product.price * Decimal("0.9")
    else:
        return product.price
```
Of course, I also had unit tests covering the change. Everything was fine, so I pushed to production and told my business users that everything was sorted out. I knew I also needed to update the product configs in the database, but I thought I could do it manually right after the release.

But I didn’t. Before I got back to the “manual” change in the db, a production issue was reported: prices were not discounted, and customers were not happy. I then spent hours fixing all the impacted orders.

I could easily have avoided this issue by adding a db migration script to my pull request. The lesson I learned is to treat a pull request as a database transaction: a pull request should contain all the changes: code, data, infra, etc.
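As a sketch of the idea, the missing piece could have been a small data migration shipped in the same pull request as the code change. This is hypothetical (the table and column names are assumed), demoed against an in-memory SQLite database:

```python
import sqlite3

# Hypothetical migration that belongs in the same PR as the pricing change:
# flag the promo products so get_price() actually applies the discount.
MIGRATION_SQL = """
UPDATE products
SET pricing_strategy = '10-percent-off'
WHERE category = 'promo'
"""


def apply_migration(conn: sqlite3.Connection) -> int:
    """Apply the config change; returns the number of rows updated."""
    cur = conn.execute(MIGRATION_SQL)
    conn.commit()
    return cur.rowcount


# Demo against an in-memory database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, category TEXT, pricing_strategy TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "promo", None), (2, "regular", None)],
)
print(apply_migration(conn))  # 1
```

Reviewing the code and the data change together is exactly what makes the PR behave like a transaction: either both land, or neither does.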
-
Python Decorator
In my first few weeks with Python, I was shocked that I could pass a function around as a parameter, for example:
```python
def foo():
    pass

synchronized_foo = synchronized(lock)(foo)
synchronized_foo()
```
and there’s a better version with a decorator:
```python
@synchronized(lock)
def foo():
    pass
```
Since I come from the Java world, I immediately linked this to AOP in Java, but decorators seem much lighter and easier to use, as described in PEP 318 – Decorators for Functions and Methods.
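For reference, one possible implementation of a `synchronized(lock)` decorator factory like the one used above could look like this (a minimal sketch, not from a real library):

```python
import threading
from functools import wraps


def synchronized(lock):
    """Hypothetical decorator factory: holds `lock` while the wrapped
    function runs, releasing it afterwards even on exceptions."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            with lock:  # acquire before the call, release after
                return f(*args, **kwargs)
        return wrapper
    return decorator


lock = threading.Lock()


@synchronized(lock)
def foo():
    return "done"


print(foo())  # done
```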
What?! Decorator on a decorator
```python
def convert_to_upper_case(f):
    """A simple decorator to convert the returned string to upper case."""
    def uppercase(*args, **kwargs):
        print("upper stats....")
        r = f(*args, **kwargs)
        return r.upper()
    return uppercase


def add_prefix(f):
    """A simple decorator to add a prefix to the return value."""
    def pre(*args, **kwargs):
        r = f(*args, **kwargs)
        return f"[prefix] {r}"
    return pre


def add_prefix_and_convert_to_upper(f):
    """A combination of `convert_to_upper_case` and `add_prefix`."""
    @convert_to_upper_case
    @add_prefix
    def convert(*args, **kwargs):
        r = f(*args, **kwargs)
        return r
    # also works:
    # convert = add_prefix(convert)
    # convert = convert_to_upper_case(convert)
    return convert


# @add_prefix
# @convert_to_upper_case
@add_prefix_and_convert_to_upper
def hello():
    return "Python"


print(f"output: {hello()}")
```

In the above example, `@add_prefix` = `add_prefix(f)`, and `@add_prefix_and_convert_to_upper` = `convert_to_upper_case(add_prefix(f))`.

In a debugger, `hello` is `<function convert_to_upper_case.<locals>.uppercase at 0x10e8da200>`. `hello` can still be a `hello` if `@wraps(f)` is added in the decorator, e.g.:

```python
from functools import wraps


def convert_to_upper_case(f):
    @wraps(f)
    def uppercase(*args, **kwargs):
        print("upper stats....")
        r = f(*args, **kwargs)
        return r.upper()
    return uppercase
```

`hello` is `<function hello at 0x10a3bd200>` now! `@wraps` is a decorator to “update a wrapper function to look like the wrapped function”.
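A quick side-by-side check of what `@wraps` changes, using two otherwise identical decorators:

```python
from functools import wraps


def plain(f):
    # Wrapper without @wraps: the wrapper's own name leaks out.
    def inner(*args, **kwargs):
        return f(*args, **kwargs)
    return inner


def preserving(f):
    # Same wrapper, but @wraps copies the wrapped function's metadata.
    @wraps(f)
    def inner(*args, **kwargs):
        return f(*args, **kwargs)
    return inner


@plain
def hello_plain():
    return "hi"


@preserving
def hello_wrapped():
    return "hi"


print(hello_plain.__name__)    # inner
print(hello_wrapped.__name__)  # hello_wrapped
```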
What about a context manager as a decorator?
`contextlib.ContextDecorator`:

“A base class that enables a context manager to also be used as a decorator. Context managers inheriting from ContextDecorator have to implement `__enter__` and `__exit__` as normal. `__exit__` retains its optional exception handling even when used as a decorator.”
How does it work?
```python
def __call__(self, func):
    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
            return func(*args, **kwds)
    return inner
```
so that a context manager can be used in both ways:
```python
@mycontext()
def function():
    print('The bit in the middle')

# or:

with mycontext():
    print('The bit in the middle')
```
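To make this concrete, here is a minimal `ContextDecorator` subclass (the `mycontext` name mirrors the docs excerpt above; the prints are just for illustration):

```python
from contextlib import ContextDecorator


class mycontext(ContextDecorator):
    """Minimal sketch: a context manager that is also usable as a decorator."""

    def __enter__(self):
        print("entering")
        return self

    def __exit__(self, *exc):
        print("exiting")
        return False  # do not swallow exceptions


@mycontext()
def function():
    print("The bit in the middle")


function()
# entering
# The bit in the middle
# exiting
```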
What about adding more arguments?
example:
```python
def async_task(name: str):
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # `submit_task` (not shown) hands the call off to a task runner
            submit_task(target=f, args=args, kwargs=kwargs, name=name)
            print(f"{name} task submitted")
        return wrapper
    return decorator


@async_task("my_task")
def my_task():
    pass
```
Summary
- a decorator in Python is a function that takes a function as a parameter
- add `@wraps()` to keep the wrapped function’s metadata (name, docstring, etc.) unchanged
-
GraphQL Server-side Journey with Python
After playing around with a few Python GraphQL libraries for a few weeks, I realized that a good GQL Python lib should:

- be less invasive: work on top of the existing stack (FastAPI/Starlette) and reuse as much code as possible (Pydantic)
- generate the GQL schema from Python code, ideally from built-in types and Pydantic types
- support Subscriptions out of the box

Currently, I’m happy with Ariadne in a code-first approach. This post tracks the journey, with the issues we found and the workarounds/solutions.

Graphene
Both graphql.org and FastAPI point to https://graphene-python.org/, so we got started with it.

As you may or may not know, GraphQL has a concept called “Schema”. Graphene takes “a code-first approach”, which is cool: “Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.”
## Hello world works well, but it’s too verbose
```python
import graphene


class Query(graphene.ObjectType):
    hello = graphene.String(name=graphene.String(default_value="World"))

    def resolve_hello(self, info, name):
        return 'Hello ' + name


schema = graphene.Schema(query=Query)
result = schema.execute('{ hello }')
print(result.data['hello'])  # "Hello World"
```
Looks simple, yet still complex: there are too many `graphene`s. Why do I need to learn another typing system? That could be handled by the framework. What about this?

```python
# hello = graphene.String(name=graphene.String(default_value="World"))
hello: str = "World"
```
Reuse Pydantic types with graphene-pydantic
Since we’re using Pydantic, which has all the typing details, why not simply use Pydantic?! https://github.com/graphql-python/graphene-pydantic is exactly what we need! But even with graphene-pydantic, an adaptor layer is required between Pydantic and Graphene, e.g.:

```python
class PersonInput(PydanticInputObjectType):
    class Meta:
        model = PersonModel
        # exclude specified fields
        exclude_fields = ("id",)


class CreatePerson(graphene.Mutation):
    class Arguments:
        person = PersonInput()

    # more code trimmed
```
Still very verbose, but much better than the original one.
## Subscriptions are not well supported yet
The documentation is super confusing: https://docs.graphene-python.org/projects/django/en/latest/subscriptions/:
“To implement websocket-based support for GraphQL subscriptions, you’ll need to do the following:

1. Install and configure django-channels.
2. Install and configure a third-party module for adding subscription support over websockets. A few options include: graphql-python/graphql-ws, datavance/django-channels-graphql-ws, jaydenwindle/graphene-subscriptions.
3. Ensure that your application (or at least your GraphQL endpoint) is being served via an ASGI protocol server like daphne (built in to django-channels), uvicorn, or hypercorn.

Note: By default, the GraphiQL interface that comes with graphene-django assumes that you are handling subscriptions at the same path as any other operation (i.e., you configured both urls.py and routing.py to handle GraphQL operations at the same path, like /graphql).”
What? Why is Django mentioned? I’m not interested, and I’m lost!
Maybe it’s time to move on.
Ariadne
This is from Graphene’s “Getting started”:
Compare Graphene’s code-first approach to building a GraphQL API with schema-first approaches like Apollo Server (JavaScript) or Ariadne (Python). Instead of writing GraphQL Schema Definition Language (SDL), we write Python code to describe the data provided by your server.
Yeah, schema-first is not cool, but Ariadne’s documentation looks much better than Graphene’s.

Subscriptions, it just works

After the experience with Graphene, the first feature I checked was subscriptions: https://ariadnegraphql.org/docs/subscriptions. It’s simple and it just works! The documentation is clean, with no django mentioned at all!

```python
import asyncio
from ariadne import SubscriptionType, make_executable_schema
from ariadne.asgi import GraphQL

type_def = """
    type Query {
        _unused: Boolean
    }

    type Subscription {
        counter: Int!
    }
"""

subscription = SubscriptionType()


@subscription.source("counter")
async def counter_generator(obj, info):
    for i in range(5):
        await asyncio.sleep(1)
        yield i


@subscription.field("counter")
def counter_resolver(count, info):
    return count + 1


schema = make_executable_schema(type_def, subscription)
app = GraphQL(schema, debug=True)
```
Schema first? It doesn’t have to be
What if I change `counter_generator` to return `str`? I need to update the Schema, and if I forget, I’m lying to my users. I hate it.

In the above example, `type_def` is kind of a duplication of the method `counter_generator` (if we add the return type), like:

```python
async def counter_generator(obj, info) -> int
```

The Schema looks reasonably easy to generate, so why can’t we generate it from Python code, especially with Pydantic? If we define a method with proper typing, we could generate the Schema easily:

```python
class HelloMessage(BaseModel):
    body: str
    from_user: UUID


query = QueryType()


@query.field('hello')
def resolve_hello(_, info) -> HelloMessage:
    request = info.context['request']
    user_agent = request.headers.get('user-agent', 'guest')
    return HelloMessage(
        body='Hello, %s!' % user_agent,
        from_user=uuid4(),
    )


# Generate type_defs from Pydantic types in the query definition.
type_defs = generate_gql_schema_str([query])

schema = make_executable_schema(
    type_defs, query, snake_case_fallback_resolvers,
)
app = GraphQL(schema, debug=True)
```
The details can be found here: https://github.com/gary-liguoliang/ariadne-pydantic/blob/master/example/main.py
With a small schema-generation utility, we managed to run Ariadne in a code-first approach:

- the code is much simpler than the original versions of both Ariadne and Graphene
- Pydantic typing is reused
- the GQL query definition method is very simple: take the input, forward it to the core application, return the output
-
Speed Up Your Django Tests
I read the book “Speed Up Your Django Tests” this week, a few interesting items:
Background/disclaimer: I’m new to Django, and I use `pytest` to run many integration Django tests, so the points listed here are purely from my point of view.

- Override settings with `@override_settings`: in case you want to override a setting for a test method, Django provides the `override_settings()` decorator (see PEP 318).
- Show slow tests with `pytest --durations 10`.
- Test markers: categorize/tag tests so that you can run different subsets, like JUnit categories. For more details: https://docs.pytest.org/en/latest/example/markers.html
- Reduce pytest test collection by setting `norecursedirs`.
- Run tests in parallel with `pytest-xdist`.
- Django’s `RequestFactory`: this is similar to the test client, but instead of making requests, it “provides a way to generate a request instance that can be used as the first argument to any view” (Django doc).
- Django’s `SimpleTestCase`: a subclass of `unittest.TestCase` that “disallows database queries by default”; however, you can still turn them on.
- Avoid fixture files [11.1]: “For data you need in individual tests, you’re better off creating it in the test case or test method.” I have to say it’s very easy to set up test data with fixtures, but it soon becomes unmanageable. The book makes a few valid points: “Fixture files are separate from the tests that use them. This makes it hard to determine which tests use which objects. The files tend to become ‘append-only’…when a new test needs a new object, it tends to be added to an existing file…if there’s some data that most of your application depends on, using a fixture causes unnecessary reloading. It will be loaded and then rolled back for each test case, even when the next test case needs the exact same data.”
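As a small illustration of the test-marker point above (assuming pytest is installed; the `slow` marker name is arbitrary):

```python
import pytest


# Tag slow tests so a quick local run can skip them with:
#   pytest -m "not slow"
@pytest.mark.slow
def test_full_report():
    assert sum(range(1000)) == 499500


def test_quick_math():
    assert 1 + 1 == 2
```

To avoid `PytestUnknownMarkWarning`, the `slow` marker should also be registered under `markers` in pytest.ini or pyproject.toml.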
Overall, I would say it’s a good Django testing book for newbies like me. The book also covers many other topics, such as “Profiling” and “Mocking”, and gives me many pointers and links for exploring Django further.
However, slow tests generally indicate design issues. All the techniques mentioned in the book can definitely help speed up the tests themselves, but if we take one step further, should we start thinking about the design?
from: Architecture Patterns with Python
If we could fundamentally resolve some of the design issues, I believe we’d end up with far fewer integration tests.