• Time Series Analysis with Python

    I have a log file containing numbers indexed by time, and I want to generate a time series chart to show the data to my users. I’m new to data visualization, so I’m tracking what I did in this post.

    Environment Setup

    Jupyter is a nice tool for playing around with Python, so install it or start a Docker container:

    docker run --rm -p 8888:8888 -v "$PWD":/Users/gouliang/jupyter-home jupyter/scipy-notebook
    

    If you’re not sure which image to choose, see: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html

    The URL with an access token is printed to the console.

    Hello world

    Start a Python notebook and run:

    ## https://stackoverflow.com/questions/19079143/how-to-plot-time-series-in-python
    
    import matplotlib.pyplot as plt
    import datetime
    import numpy as np
    
    x = np.array([datetime.datetime(2013, 9, 28, i, 0) for i in range(24)])
    y = np.random.randint(100, size=x.shape)
    
    plt.plot(x,y)
    plt.show()
    

    A nice time series chart will be displayed.

    #WIP

  • Orika with Spring-boot Devtools MappingException/ClassCastException: Mapper Cannot be Cast to GeneratedObjectBase

    I was trying to run a Spring Boot application that uses an Orika mapper from one of its dependencies (the dependency is released as a jar), and it failed with:

    Caused by: ma.glasnost.orika.MappingException: java.lang.ClassCastException: ma.glasnost.orika.generated.Orika_UserDest_UserSource_Mapper3053159455512$0 cannot be cast to ma.glasnost.orika.impl.GeneratedObjectBase
    	at ma.glasnost.orika.impl.generator.MapperGenerator.build(MapperGenerator.java:104) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.DefaultMapperFactory.buildMapper(DefaultMapperFactory.java:1480) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.DefaultMapperFactory.build(DefaultMapperFactory.java:1295) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.DefaultMapperFactory.getMapperFacade(DefaultMapperFactory.java:883) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.ConfigurableMapper.init(ConfigurableMapper.java:121) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.ConfigurableMapper.<init>(ConfigurableMapper.java:97) ~[orika-core-1.5.1.jar:na]
    	at com.liguoliang.common.mapper.UserMapper.<init>(UserMapper.java:8) ~[common-1.0-SNAPSHOT.jar:na]
    	at com.liguoliang.springboot1.helloworld.TestApp.run(TestApp.java:30) ~[classes/:na]
    	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:732) [spring-boot-1.5.15.RELEASE.jar:1.5.15.RELEASE]
    	... 11 common frames omitted
    Caused by: java.lang.ClassCastException: ma.glasnost.orika.generated.Orika_UserDest_UserSource_Mapper3053159455512$0 cannot be cast to ma.glasnost.orika.impl.GeneratedObjectBase
    	at ma.glasnost.orika.impl.generator.SourceCodeContext.getInstance(SourceCodeContext.java:262) ~[orika-core-1.5.1.jar:na]
    	at ma.glasnost.orika.impl.generator.MapperGenerator.build(MapperGenerator.java:73) ~[orika-core-1.5.1.jar:na]
    	... 19 common frames omitted
    

    I was confused by this error because the class Orika_UserDest_UserSource_Mapper3053159455512$0 is obviously a generated class, and it is supposed to work with Orika itself.

    To illustrate the project structure:

    The Spring Boot project I’m currently working on:

    <project>
    	<groupId>com.liguoliang</groupId>
    	<artifactId>spring-boot-1.5-hello-world</artifactId>
    	<version>0.0.1-SNAPSHOT</version>
    	<packaging>jar</packaging>
    
    	<parent>
    		<groupId>org.springframework.boot</groupId>
    		<artifactId>spring-boot-starter-parent</artifactId>
    		<version>1.5.15.RELEASE</version>
    		<relativePath/> <!-- lookup parent from repository -->
    	</parent>
    
    	<dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter</artifactId>
            </dependency>
    		<dependency>
    			<groupId>com.liguoliang</groupId>
    			<artifactId>common-demo</artifactId>
    			<version>0.0.1</version>
    		</dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-devtools</artifactId>
                <optional>true</optional>
            </dependency>
        </dependencies>
     </project>
    

    The pom of the dependency jar (which is where the mapper comes from):

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>com.liguoliang</groupId>
        <artifactId>common-demo</artifactId>
        <version>0.0.1</version>
    
        <packaging>jar</packaging>
    
        <dependencies>
            <dependency>
                <groupId>net.rakugakibox.spring.boot</groupId>
                <artifactId>orika-spring-boot-starter</artifactId>
                <version>1.5.0</version>
            </dependency>
        </dependencies>
    </project>
    

    What I found by debugging

    The issue happens here:

    // SourceCodeContext.java
    public <T extends GeneratedObjectBase> T getInstance() throws SourceCodeGenerationException, InstantiationException,
            IllegalAccessException {
        T instance = (T) compileClass().newInstance();
        ...
    }
    

    Debugging compileClass() shows that the source of the generated class is:

    package ma.glasnost.orika.generated;
    public class Orika_UserDest_UserSource_Mapper3053159455512$0 extends ma.glasnost.orika.impl.GeneratedMapperBase {
    ...
    

    GeneratedMapperBase extends GeneratedObjectBase, so from reading the code I was sure that compileClass().newInstance() must be a GeneratedObjectBase.

    But the exception happened consistently. At runtime, an instance of A is not an A; could that be because A was loaded by more than one class loader? I found:

    • GeneratedObjectBase.class was loaded by Launcher$AppClassLoader
    • the class returned by compileClass() and its compileClass().getSuperclass().getSuperclass() (which is GeneratedObjectBase) were loaded by RestartClassLoader, whose parent is the Launcher$AppClassLoader

    So the error message could be more helpful: the JVM failed to cast an instance of GeneratedObjectBase loaded by RestartClassLoader to GeneratedObjectBase loaded by AppClassLoader.
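
    The same effect can be reproduced without Orika or devtools. The sketch below is only an illustration, not what Spring Boot actually does: TwoLoadersDemo, Base and IsolatingLoader are hypothetical names, and IsolatingLoader simply defines Base itself instead of delegating to its parent. It assumes the compiled .class file can be read as a resource from the application class loader.

```java
import java.io.IOException;
import java.io.InputStream;

class TwoLoadersDemo {

    /** Hypothetical stand-in for a library base class such as GeneratedObjectBase. */
    public static class Base {}

    /** Defines Base itself instead of asking its parent; everything else is delegated. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Base.class.getName())) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length); // second copy of Base
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns true when an instance of the reloaded Base cannot be cast to the original Base. */
    static boolean castFailsAcrossLoaders() {
        try {
            ClassLoader appLoader = TwoLoadersDemo.class.getClassLoader();
            Class<?> reloaded = new IsolatingLoader(appLoader).loadClass(Base.class.getName());
            Object instance = reloaded.getDeclaredConstructor().newInstance();
            try {
                Base unused = (Base) instance; // same class name, different defining loader
                return false;
            } catch (ClassCastException e) {
                return true;
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not reload Base", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails across loaders: " + castFailsAcrossLoaders());
    }
}
```

    Both classes are named TwoLoadersDemo$Base, but the JVM identifies a class by its name together with its defining loader, so the cast is rejected just like in the Orika stack trace.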

    Why was GeneratedObjectBase loaded by two different class loaders?

    • Before the mapper is actually initialized, the mapper (part of the common-demo lib) is loaded by AppClassLoader; GeneratedObjectBase is probably loaded by AppClassLoader together with it
    • When the mapper is initialized, Spring Boot devtools uses the RestartClassLoader to load the classes in the IDE workspace, to get a faster ‘reload’:
      • Orika uses the RestartClassLoader (the thread context class loader) to compile the generated class:
         // JavassistCompilerStrategy.compileClass()
          compiledClass = byteCodeClass.toClass(Thread.currentThread().getContextClassLoader(), this.getClass().getProtectionDomain());
        

        So the generated class and its parent classes are all loaded by RestartClassLoader, and that is how the exception happens.

    How to fix?

    Conclusion

    The restart functionality of Spring Boot devtools is very helpful when working in an IDE: the application restarts automatically when a class changes. However, issues can happen when class loaders get mixed up. Here, because of the dependency structure, GeneratedObjectBase was loaded by two loaders, which caused the exception.

  • Always Check Build Lock File into Source Control

    One of my teammates (a frontend developer) came to me with a question: “Why is my CI build failing? It was fine yesterday, and I just tried it and it works fine locally!”

    I don’t have much experience with the frontend, but I don’t believe in magic in computer science; everything has a reason. So I stopped my task and started reading the CI output:

    > no yarn.lock detected, using npm
    > npm run build
    .....
    ...another-lib.js v1.3.1
    

    Hmm. I asked my teammate, “How do you build the project locally?” “yarn build.” I found “another-lib.js 1.3.0” in his local yarn.lock file, and as the CI output pointed out, yarn.lock was not under version control, so the CI build resolved the newer 1.3.1 instead.

    Why does a build lock file have to be checked in?

    All yarn.lock files should be checked into source control (e.g. git or mercurial). This allows Yarn to install the same exact dependency tree across all machines, whether it be your coworker’s laptop or a CI server. (https://yarnpkg.com/lang/en/docs/yarn-lock/#toc-check-into-source-control)

    Conclusions

    I had never used yarn before, but the first lesson I learned is: please commit your yarn.lock files!

  • Java String.intern() and String Pooling

    Strings are immutable in Java, so the JVM can optimize memory usage by storing only one copy of each literal string in a pool.

    What is String.intern()?

    Java 1.8:
        /**
         * Returns a canonical representation for the string object.
         * <p>
         * A pool of strings, initially empty, is maintained privately by the
         * class {@code String}.
         * <p>
         * When the intern method is invoked, if the pool already contains a
         * string equal to this {@code String} object as determined by
         * the {@link #equals(Object)} method, then the string from the pool is
         * returned. Otherwise, this {@code String} object is added to the
         * pool and a reference to this {@code String} object is returned.
         * <p>
         * It follows that for any two strings {@code s} and {@code t},
         * {@code s.intern() == t.intern()} is {@code true}
         * if and only if {@code s.equals(t)} is {@code true}.
         * <p>
         * All literal strings and string-valued constant expressions are
         * interned. String literals are defined in section 3.10.5 of the
         * <cite>The Java&trade; Language Specification</cite>.
         *
         * @return  a string that has the same contents as this string, but is
         *          guaranteed to be from a pool of unique strings.
         */
        public native String intern();
    

    Try out:

    String s2 = new String("another-random-string"); // the literal "another-random-string" is already in the pool; s2 is a separate copy on the heap.
    System.out.println(s2 == s2.intern()); // false. s2.intern() returns the reference that already exists in the string pool.
    
    String s1 = new String(new StringBuilder("this").append("-is-a-random-string")); // "this-is-a-random-string" is built at runtime and is new to the JVM.
    System.out.println(s1 == s1.intern()); // true on Java 7 and later. s1.intern() adds s1 itself to the string pool.
    

    With -XX:+PrintStringTableStatistics:

    StringTable statistics:
    Number of buckets       :     60013 =    480104 bytes, avg   8.000
    Number of entries       :       883 =     21192 bytes, avg  24.000
    Number of literals      :       883 =     57864 bytes, avg  65.531
    Total footprint         :           =    559160 bytes
    Average bucket size     :     0.015
    Variance of bucket size :     0.015
    Std. dev. of bucket size:     0.121
    Maximum bucket size     :         2
    

    Conclusion

    • All literal strings and string-valued constants refer to a string inside the pool
    • When intern() is invoked, a reference from the pool is returned; if the string is not yet in the pool, it is added first
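
    The two points above can be checked with reference comparisons. This is a small sketch; InternDemo and the string values are made up for illustration, and it relies only on behaviour documented in the Javadoc quoted earlier.

```java
class InternDemo {

    /** Returns the results of a few reference comparisons against a pooled literal. */
    static boolean[] check() {
        String literal = "hello-pool";            // literals are interned
        String copy = new String("hello-pool");   // explicit copy on the heap
        String prefix = "hello-";                 // not final, so prefix + "pool" is built at runtime
        String runtime = prefix + "pool";

        return new boolean[] {
            literal == copy,                      // false: different objects
            literal == copy.intern(),             // true: intern() returns the pooled instance
            literal == "hello-" + "pool",         // true: constant expression, interned at compile time
            literal == runtime,                   // false: runtime concatenation creates a new object
            literal == runtime.intern()           // true: same pooled instance again
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(check()));
        // prints [false, true, true, false, true]
    }
}
```

    Equality of content is never in question here; equals() returns true in every case. Only reference identity changes depending on whether the pooled instance is used.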

    A deeper post: http://java-performance.info/string-intern-in-java-6-7-8/

  • Does Element Sequence Matter in XML? It Depends.

    I was reviewing a pull request that integrates an existing app with an external SOAP payment gateway. The gateway looks very simple. Given a request like:

    <collectMoney>
        <fromAccount>111</fromAccount>
        <amount>999</amount>
    </collectMoney>
    <more-details>...</more-details>
    

    The payment gateway returns:

    <result>
        <status>success</status>
    </result>
    

    It works fine when I test it with my HTTP client.

    According to the WSDL shared by the payment gateway, fromAccount and amount must appear in that order. However, I found a template inside the pull request with a different order:

    <collectMoney>
        <amount></amount>
        <fromAccount></fromAccount>
    </collectMoney>
    <more-details>...</more-details>
    

    Does this mean the app is actually generating an invalid XML request? How does this work?

    It’s good to see that an app just works, but why?

    I dug into the source code, and this is how it works: create CollectMoney (the class generated from the XSD) -> render an XML document using the template -> JAXB unmarshaller -> CollectMoney instance -> send to the payment gateway via Apache CXF

    I enabled the Apache CXF loggers to print the outbound/inbound messages, and the log shows that the app managed to send the elements in the correct order! What kind of magic is this?

    It’s the JAXB unmarshaller:

    // copied from JAXB Unmarshaller.java
     * Since unmarshalling invalid XML content is defined in JAXB 2.0,
     * the Unmarshaller default validation event handler was made more lenient
     * than in JAXB 1.0.  When schema-derived code generated
     * by JAXB 1.0 binding compiler is registered with {@link JAXBContext},
     * the default unmarshal validation handler is
     * {@link javax.xml.bind.helpers.DefaultValidationEventHandler} and it
     * terminates the marshal  operation after encountering either a fatal error or an error.
     * For a JAXB 2.0 client application, there is no explicitly defined default
     * validation handler and the default event handling only
     * terminates the unmarshal operation after encountering a fatal error.
    

    What is an error according to the W3C definition?

    // copied from org.xml.sax.ErrorHandler.java
    This corresponds to the definition of "error" in section 1.2
     * of the W3C XML 1.0 Recommendation.  For example, a validating
     * parser would use this callback to report the violation of a
     * validity constraint.  The default behaviour is to take no
     * action.
    

    https://www.w3.org/TR/xml/#sec-terminology defines an error as:

    [Definition: A violation of the rules of this specification; results are undefined. Unless otherwise specified, failure to observe a prescription of this specification indicated by one of the keywords must, required, must not, shall and shall not is an error. Conforming software may detect and report an error and may recover from it.]

    The specification also says this about a sequence:

    content particles occurring in a sequence list must each appear in the element content in the order given in the list.

    So an incorrect sequence is an error, not a fatal error, and by default JAXB 2.0 continues processing.
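
    The error versus fatal error distinction can be seen with the JDK’s own schema validator. Note that this sketch does not use JAXB at all: it uses javax.xml.validation with an org.xml.sax.ErrorHandler, and the inline XSD is my own assumption of what the collectMoney sequence might look like (the real WSDL is not shown here). SequenceOrderDemo is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class SequenceOrderDemo {

    // Assumed schema: fromAccount must come before amount.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='collectMoney'><xs:complexType><xs:sequence>"
        + "<xs:element name='fromAccount' type='xs:string'/>"
        + "<xs:element name='amount' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Validates xml against XSD and returns the kinds of events the handler received. */
    static List<String> validate(String xml) {
        List<String> events = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { events.add("warning"); }
                // Recoverable: record it and let processing continue.
                @Override public void error(SAXParseException e) { events.add("error"); }
                // Not recoverable: record it and stop.
                @Override public void fatalError(SAXParseException e) throws SAXParseException {
                    events.add("fatalError");
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            // Processing was aborted; the events recorded so far are still returned.
            events.add("aborted");
        }
        return events;
    }

    public static void main(String[] args) {
        String wrongOrder = "<collectMoney><amount>999</amount><fromAccount>111</fromAccount></collectMoney>";
        String rightOrder = "<collectMoney><fromAccount>111</fromAccount><amount>999</amount></collectMoney>";
        System.out.println("wrong order: " + validate(wrongOrder));
        System.out.println("right order: " + validate(rightOrder));
    }
}
```

    With this handler the out-of-order document is reported through error() and processing is allowed to go on, which is the same kind of leniency the JAXB 2.0 default event handling applies.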

    In my app’s case, an invalid XML request gets unmarshalled to a Java instance, and then the instance gets marshalled to a valid XML document, which is sent to the payment gateway.

    Conclusion

    By definition, the order of elements in a ‘sequence’ does matter. However, in the context of unmarshalling, it depends on how the XML document gets unmarshalled.

    Abstractions leak. The unmarshalling layer doesn’t add much value here and causes unnecessary confusion by leaking incorrect information; the app could directly use the classes generated from the WSDL/XSD.

