Edit from July 5th, 2019:

This post was written before the release of JPype 0.7.0. This new version of JPype introduces a nicer syntax to implement Java interfaces in Python, and solves one issue I mention below with the passing of numpy arrays.

Check the new JPype documentation for more details, they also improved that part quite a bit!

If you have been doing any kind of scientific work in the last years, you probably noticed the trend: Python is growing fast as the language of choice for more and more tasks. At MATSim, this led to more and more requests for a “Python API”, “Python Bindings”, or “I only know Python, how do I start?”. MATSim is written in Java, so at the beginning, my answer was “Java is not that complicated, and we do not have capacity to support Python, so I am afraid you will have to use Java”. With time, however, this answer became less and less satisfying, as there are good reasons to use Python to control a scientific computing program:

  • there are great libraries to process and analyse data in Python, with active communities. Pandas, Matplotlib and other SciPy packages are great.
  • in contrast to other popular languages in the scientific community (looking at you, R), you can actually go beyond the simple script in Python without losing your sanity.
  • To my surprise, I found Python much easier to teach than Java. I personally love how statically typed languages support you, but I found that lots of students just do not seem to grasp the idea of a type hierarchy fast enough for it to be taught “on the side”. Python also makes more sense if data manipulation and analysis is expected.

Good, so our users and students know Python, Python is better at data manipulation than most languages, and we do not want to teach both Java and Python to non-computer majors. Can we provide a simple Python API to a Java program, that is:

  • close to the Java API
  • understandable by novice Python users, and
  • does not kill performance?

In this blog post, I will try to convince you we can. While the experiments were done in the context of MATSim, I will work here on a simple idealized simulation framework, that will both be easier to understand for readers and help in performance analysis. What we will have at the end is a fully functional Python API for a simulation written in Java.

All the code for this blog post is available on GitHub.

An Idealized Simulation Software

To play around and have something to measure, we will need some minimal simulation software that is fast (so that what we measure is what we are interested in) and yet has a realistic structure. We will do this using a kindergarten simulation:

  • a kindergarten is composed of a collection of agents (children)
  • at each time step, each child can do one of the following:
    • jump
    • sing
    • speak
  • if jumping, the height will vary randomly
  • if singing, they will choose one song at random
  • if speaking, they will just say random nonsense

Our simulation software allows us to analyze what happens in our super-realistic kindergarten by allowing us to listen for events. You might be familiar with the concept from interactive programs: every time a child jumps, sings or speaks, an object is created containing all relevant information, and passed to event handlers the user registered. For the following, I will assume that information flows from the simulation to the handlers only, that is, event handlers are not allowed to influence the simulation (by making children sing the same song as the majority of their friends for instance). This is mostly a design decision when designing the Python API: the simulation is written in Java, and the Python API is there mostly to enable writing analysis code directly in the “natural” language of analysis. Implementation of new behavior is, for the moment, kept out of scope.

To listen for events, the user needs to implement interfaces such as this one:

public interface JumpEventHandler extends EventHandler {
   void handleEvent(JumpEvent event);
}

where EventHandler is defined as

public interface EventHandler {
    void notifyEnd();
    void notifyStart();
}

So the user gets notified when:

  • the simulation starts. This can be useful to reset internal structure or open a file, for instance
  • an event of a given type occurs. The user can do whatever needs to be done with that information
  • the simulation ends. Cleanup can occur there, or closing a file.
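To make the notification order concrete, here is a minimal plain-Python sketch of the same lifecycle. It is only an illustration: the names `EventsManager`, `RecordingHandler` and `run_toy_simulation` are hypothetical stand-ins, not part of the Java code of this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JumpEvent:
    """Hypothetical Python stand-in for the Java JumpEvent."""
    agent_id: str
    time: float
    height_m: float


class EventsManager:
    """Toy dispatcher: forwards lifecycle notifications and events to handlers."""

    def __init__(self):
        self._handlers = []

    def add_event_handler(self, handler):
        self._handlers.append(handler)

    def notify_start(self):
        for handler in self._handlers:
            handler.notify_start()

    def process_event(self, event):
        for handler in self._handlers:
            handler.handle_event(event)

    def notify_end(self):
        for handler in self._handlers:
            handler.notify_end()


class RecordingHandler:
    """Records the order in which it gets notified."""

    def __init__(self):
        self.log = []

    def notify_start(self):
        self.log.append("start")

    def handle_event(self, event):
        self.log.append(
            f"{event.agent_id} jumps {event.height_m}m at time {event.time}")

    def notify_end(self):
        self.log.append("end")


def run_toy_simulation(handler):
    """Run a one-event 'simulation': start, one jump, end."""
    events = EventsManager()
    events.add_event_handler(handler)
    events.notify_start()
    events.process_event(JumpEvent("0", 0.0, 0.5))
    events.notify_end()
    return handler.log
```

Information only flows from the dispatcher to the handler here, which mirrors the design decision described above.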

Implementing several of those interfaces at the same time allows you to listen to several types of events at once. For instance, to print everything that happens:

public class LoggingEventHandler implements JumpEventHandler, SpeakEventHandler, SingEventHandler {
    @Override
    public void handleEvent(JumpEvent event) {
        System.out.println(event.getAgentId()+
                " jumps "+event.getHeight_m()+"m"+
                " at time "+event.getTime());
    }

    @Override
    public void handleEvent(SingEvent event) {
        System.out.println(event.getAgentId()+
                " sings "+event.getSong()+
                " at time "+event.getTime());
    }

    @Override
    public void handleEvent(SpeakEvent event) {
        System.out.println(event.getAgentId()+
                " says "+event.getMessage()+
                " at time "+event.getTime());
    }

    @Override
    public void notifyEnd() {
        System.out.println("end simulation");
    }

    @Override
    public void notifyStart() {
        System.out.println("start simulation");
    }
}

Adding our handler to the simulation happens by doing:

public class RunSimpleSimulation {
    public static void main(String... args) {
        // Create a simulation with 3 agents and 3 time steps
        Simulation simulation = new Simulation(3, 3);

        // Add our handler to it
        simulation.getEvents().addEventHandler(new LoggingEventHandler());

        // Run it!
        simulation.run();
    }
}

Which gives the output

start simulation
0 jumps 0.27332631373580385m at time 0.0
1 jumps 1.5435972766632988m at time 0.0
2 jumps 3.538552944775593m at time 0.0
0 jumps 4.516861323360891m at time 1.0
1 says �\�a�l� at time 1.0
2 jumps 2.3182678790457665m at time 1.0
0 sings NinetyNineBottles at time 2.0
1 jumps 0.7597912428148812m at time 2.0
2 says ����R�= at time 2.0
end simulation

Looks realistic enough to me.

JPype or Jnius: “You Already Have a Python API, You Just Do Not Know It Yet”

OK, so we have a simple simulation software, that we want to construct a Python API for. Ideally, that API should:

  • look similar to the Java counterpart, to allow users to easily switch back and forth between the two languages
  • be reasonably efficient

Fortunately, there are two libraries that claim to give those points out of the box: JPype and Jnius. Looking at their GitHub pages here and here, both have about the same amount of activity, with one or two main contributors and a few isolated pull requests. This is definitely a factor to consider before going all-in with using them: development might stop unexpectedly, and bugs can take time to get fixed, if at all.

Both look pretty similar in their use. The simple run script above looks like so in JPype:

import jpype
import jpype.imports as jimport

jimport.registerDomain('ch')
jpype.addClassPath("path/to/demo-python-java-api-1.0-SNAPSHOT-jar-with-dependencies.jar")
jpype.startJVM(jpype.get_default_jvm_path(),
        "-Djava.class.path=%s" % jpype.getClassPath())

from ch.dubernet.demopythonapi.simulation import Simulation
from ch.dubernet.demopythonapi.simulation.events import LoggingEventHandler

simulation = Simulation(3, 3)
simulation.getEvents().addEventHandler(LoggingEventHandler())
simulation.run()

and so in Jnius:

import jnius_config

jnius_config.set_classpath("path/to/demo-python-java-api-1.0-SNAPSHOT-jar-with-dependencies.jar")

import jnius


Simulation = jnius.autoclass(
        'ch.dubernet.demopythonapi.simulation.Simulation')
LoggingEventHandler = jnius.autoclass(
        'ch.dubernet.demopythonapi.simulation.events.LoggingEventHandler')

simulation = Simulation(3, 3)
simulation.getEvents().addEventHandler(LoggingEventHandler())
simulation.run()

Apart from minor syntactic differences, both look similar, readable, and claim to be as fast as can be by directly interfacing the Python and Java Virtual Machines using the Java Native Interface, or JNI.

There are more differences that cannot be seen in these simple examples, in particular in how to implement Java interfaces. Do not worry, we will come to this in a minute. Overall, Jnius feels a bit closer to Java, in particular in that it allows you to specify method signatures, but it has other issues, to which we will come back.

Costs of JPype and Jnius

Both libraries already provide a simple API “out of the box”, but they have two kinds of hidden performance costs:

  • communication costs: each time Python code calls a method on a Java object, or Java code calls a method on a Python implementation of an interface, the two virtual machines have to communicate, which is not free.
  • conversion costs: Every time a Java object is returned to Python code, or the other way, some conversion has to take place. This, again, is not instantaneous.

For lots of things, those costs might be perfectly acceptable. But sometimes, they are too much.

Another issue with just using those libraries directly is that the fundamental differences between Java and Python, both technical and cultural, might make it pretty awkward to do complicated things with them. In particular, it is not unusual for Java libraries to rely on IDE support quite a lot. Even lines like the simulation.getEvents().addEventHandler(SomeHandler()) above are no problem in Java, because the IDE knows that simulation has a method named getEvents() that returns an object that has a method addEventHandler that expects an object implementing the EventHandler interface, and will help you all the way with suggestions and error messages. With Python, on the other hand, you are pretty much on your own to figure that out by going through the documentation. That makes just using those libraries potentially difficult, which goes against our aim to make our Java program easier to teach!

A Closer Look at Communication Costs

Let’s consider the Python line simulation.getEvents().addEventHandler(SomeHandler()) again, and let’s try to figure out what happens:

  • you call the method getEvents(). The library then calls the JVM, gets the return value, and converts it one way or another
  • you then call addEventHandler(SomeHandler()) on that return value. The library needs to convert SomeHandler() to something the JVM understands, pass it to it, and potentially expect a return value and convert it again

That is quite some back and forth. It is not a big deal performance-wise for that line, which is called once, but can become a problem for things that are executed a lot, such as analyzing the events.
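The amount of back and forth can be made visible with a toy model. The sketch below is plain Python and does not use JPype or Jnius: `FakeJavaObject` and `CrossingCounter` are hypothetical names, and the model simply assumes that every method call on a wrapped Java object costs one round trip between the two virtual machines.

```python
class CrossingCounter:
    """Shared counter of simulated round trips between Python and the JVM."""

    def __init__(self):
        self.crossings = 0


class FakeJavaObject:
    """Toy proxy: each method call counts as one round trip, returns a proxy."""

    def __init__(self, counter):
        self._counter = counter

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. the "Java" methods.
        def method(*args):
            self._counter.crossings += 1
            return FakeJavaObject(self._counter)
        return method


def crossings_for_setup():
    """Round trips for the one-off line that registers the handler."""
    counter = CrossingCounter()
    simulation = FakeJavaObject(counter)
    simulation.getEvents().addEventHandler(object())
    return counter.crossings


def crossings_for_naive_handler(n_events):
    """Round trips when a Python handler reads three fields of each event."""
    counter = CrossingCounter()
    for _ in range(n_events):
        counter.crossings += 1  # Java calls handleEvent on the Python handler
        event = FakeJavaObject(counter)
        event.getAgentId()      # each getter goes back to the JVM
        event.getHeight_m()
        event.getTime()
    return counter.crossings


print(crossings_for_setup())                   # 2
print(crossings_for_naive_handler(100 * 100))  # 40000
```

Under this (simplified) model, the setup line costs two round trips, while handling the events of 100 agents over 100 time steps costs tens of thousands.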

Reducing Communication Costs Using Protocol Buffers

This communication is sometimes unavoidable, but in our case, some of it can be gotten rid of, as we mostly want to pass immutable data objects around, for which one can communicate the whole information at once. The idea is to first convert the data to one single data structure on the Java side, pass that data as a block to Python, and decode it again there. JSON would be a natural candidate, but will most likely have poor performance. A much better choice is Google’s Protocol Buffers, which use a compact binary format and provide code generators for Java and Python, among others.

Protocol Buffers allow you to specify your data format in a readable way, and take care of generating converters to and from byte arrays for those formats. In the case of our three types of events, the format looks like this:

syntax = "proto3";

option java_package = "ch.dubernet.demopythonapi.simulation.protobuf";

message JumpEvent {
    string agentId = 1;
    double time = 2;
    double height_m = 3;
}

message SingEvent {
    string agentId = 1;
    double time = 2;

    enum Song {
        OLD_MC_DONALDS = 0;
        IF_YOU_ARE_HAPPY = 1;
        NINETY_NINE_BOTTLES = 2;
    }

    Song song = 3;
}

message SpeakEvent {
    string agentId = 1;
    double time = 2;
    // cannot name this "message", because this is a reserved keyword in protobuf
    string text = 3;
}

This is readable enough!
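To illustrate the "one block of bytes" idea without requiring the protobuf toolchain, here is a small sketch using only the standard library `struct` module. The byte layout is a made-up one for this example and is not the Protocol Buffers wire format; the function names are hypothetical too.

```python
import struct

# Hypothetical layout, NOT the protobuf encoding: little-endian,
# 2-byte length of the agent id, two doubles (time, height), then the id bytes.
_HEADER = struct.Struct("<Hdd")


def encode_jump_event(agent_id, time, height_m):
    """Pack all fields of a jump event into a single bytes object."""
    raw_id = agent_id.encode("utf-8")
    return _HEADER.pack(len(raw_id), time, height_m) + raw_id


def decode_jump_event(message):
    """Recover all fields from the bytes produced by encode_jump_event."""
    id_length, time, height_m = _HEADER.unpack_from(message, 0)
    start = _HEADER.size
    agent_id = message[start:start + id_length].decode("utf-8")
    return {"agentId": agent_id, "time": time, "height_m": height_m}


message = encode_jump_event("42", 1.0, 0.75)
print(len(message))                # 20 bytes cross the boundary, in one go
print(decode_jump_event(message))
```

The point is only that the receiver gets every field from a single transfer, instead of one call per getter. Protocol Buffers does the same job, with generated code on both sides and a format that can evolve.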

The Experiment

Now, let’s try to implement some event handling in Python, and measure performance. As a benchmark, our event handling code will simply consist of accessing all fields of the events, so that it only measures the communication and conversion overhead.

The benchmark Java code looks like this:

package ch.dubernet.demopythonapi.simulation.events;

/**
 * An event handler that simply calls every getter on the events,
 * to compare with the cost of accessing them from
 * Python. Totally useless outside of the performance benchmark.
 */
public class BenchmarkEventHandler implements
            SpeakEventHandler,
            SingEventHandler,
            JumpEventHandler {
    @Override
    public void handleEvent(JumpEvent event) {
        event.getAgentId();
        event.getHeight_m();
        event.getTime();
    }

    @Override
    public void handleEvent(SingEvent event) {
        event.getAgentId();
        event.getSong();
        event.getTime();
    }

    @Override
    public void handleEvent(SpeakEvent event) {
        event.getAgentId();
        event.getMessage();
        event.getTime();
    }

    @Override
    public void notifyEnd() {

    }

    @Override
    public void notifyStart() {

    }
}

Base Case: Using the Java Implementation

Our baseline is the pure Java implementation of the handler, driven from Python: all other approaches will be compared to it. To reduce the impact of natural fluctuations in wall clock execution time, we will average over 100 simulations.

import time

from ch.dubernet.demopythonapi.simulation import Simulation
from ch.dubernet.demopythonapi.simulation.events import BenchmarkEventHandler

simulation = Simulation(100, 100)
simulation.getEvents().addEventHandler(BenchmarkEventHandler())

total_time_s = 0

for i in range(100):
    total_time_s -= time.time()
    simulation.run()
    total_time_s += time.time()
    
print(total_time_s / 100)
0.0021571803092956545

OK, so something on the order of a few thousandths of a second is our target!
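Since the same timing loop is repeated for every variant, it can be factored into a small helper. This is a minimal sketch, with `average_runtime_s` being a hypothetical name; it uses `time.perf_counter()`, which is a monotonic clock intended for measuring intervals, instead of `time.time()`.

```python
import time


def average_runtime_s(run, repetitions=100):
    """Average wall-clock duration, in seconds, of calling run() repeatedly."""
    total_s = 0.0
    for _ in range(repetitions):
        start = time.perf_counter()
        run()
        total_s += time.perf_counter() - start
    return total_s / repetitions


# Example with a dummy workload standing in for simulation.run
print(average_runtime_s(lambda: sum(range(1000)), repetitions=10))
```

With a started JVM and a configured simulation, the call would presumably look like `average_runtime_s(simulation.run)`.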

Worst Case: Using the Libraries Naively

The first method to try is to simply use the libraries without any sort of special care, and see what it gives. Let’s start with JPype:

import time 

import jpype
import jpype.imports as jimport

jimport.registerDomain('ch')
jpype.addClassPath("path/to/demo-python-java-api-1.0-SNAPSHOT-jar-with-dependencies.jar")
jpype.startJVM(jpype.get_default_jvm_path(), "-Djava.class.path=%s" % jpype.getClassPath())

from ch.dubernet.demopythonapi.simulation import Simulation
from ch.dubernet.demopythonapi.simulation.events import JumpEventHandler, SingEventHandler, SpeakEventHandler, JumpEvent, SingEvent, SpeakEvent

simulation = Simulation(100, 100)

class PythonBenchmarkHandler:
    def notifyStart(self):
        pass
    
    def notifyEnd(self):
        pass
    
    def handleEvent(self, event):
        # Here we see one of the caveats of JPype's way to handle
        # interface implementation:
        # we need to manually perform polymorphic dispatching
        if isinstance(event, JumpEvent):
            event.getAgentId()
            event.getHeight_m()
            event.getTime()
        elif isinstance(event, SingEvent):
            event.getAgentId()
            event.getSong()
            event.getTime()
        elif isinstance(event, SpeakEvent):
            event.getAgentId()
            event.getMessage()
            event.getTime()
        else:
            raise Exception()

handler = jpype.JProxy((JumpEventHandler,
                        SingEventHandler,
                        SpeakEventHandler),
                    inst=PythonBenchmarkHandler())
simulation.getEvents().addEventHandler(handler)

total_time_s = 0

for i in range(100):
    total_time_s -= time.time()
    simulation.run()
    total_time_s += time.time()
    
print(total_time_s / 100)

The JProxy line is the way JPype handles implementing interfaces. It creates a proxy Java object that implements the interfaces and will forward calls to Python methods of the same name. Note how only the name, not the signature matters.

The output is:

0.5479076290130616

This is two orders of magnitude slower than the pure Java version! That sucks.
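As an aside on the manual dispatching in the handler above: in plain Python, the if/elif chain can be replaced by `functools.singledispatchmethod` (Python 3.8+). The sketch below uses empty placeholder classes instead of the Java event classes; whether the same dispatch works on JPype's wrapper classes is an assumption that is not verified here.

```python
from functools import singledispatchmethod


# Plain-Python placeholders for the Java event classes
class JumpEvent:
    pass


class SingEvent:
    pass


class SpeakEvent:
    pass


class DispatchingHandler:
    """Handler with one implementation per event type, selected by type."""

    def __init__(self):
        self.seen = []

    @singledispatchmethod
    def handleEvent(self, event):
        # Fallback for event types without a registered implementation
        raise TypeError(f"unsupported event type: {type(event).__name__}")

    @handleEvent.register(JumpEvent)
    def _(self, event):
        self.seen.append("jump")

    @handleEvent.register(SingEvent)
    def _(self, event):
        self.seen.append("sing")

    @handleEvent.register(SpeakEvent)
    def _(self, event):
        self.seen.append("speak")


handler = DispatchingHandler()
for event in (SingEvent(), JumpEvent(), SpeakEvent()):
    handler.handleEvent(event)
print(handler.seen)
```

This only changes readability: the proxy still sees a single method named handleEvent, and the dispatching cost stays on the Python side.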

Let’s see whether Jnius is any better:

import jnius_config

jnius_config.set_classpath("path/to/demo-python-java-api-1.0-SNAPSHOT-jar-with-dependencies.jar")

import jnius

import time

Simulation = jnius.autoclass('ch.dubernet.demopythonapi.simulation.Simulation')
BenchmarkEventHandler = jnius.autoclass('ch.dubernet.demopythonapi.simulation.events.BenchmarkEventHandler')

JumpEventHandler = 'ch/dubernet/demopythonapi/simulation/events/JumpEventHandler'
SingEventHandler = 'ch/dubernet/demopythonapi/simulation/events/SingEventHandler'
SpeakEventHandler = 'ch/dubernet/demopythonapi/simulation/events/SpeakEventHandler'

class PythonBenchmarkHandler(jnius.PythonJavaClass):
    __javainterfaces__ = [JumpEventHandler,
                          SingEventHandler,
                          SpeakEventHandler]

    @jnius.java_method('()V')
    def notifyStart(self):
        pass
    
    @jnius.java_method('()V')
    def notifyEnd(self):
        pass
    
    @jnius.java_method(
        '(Lch/dubernet/demopythonapi/simulation/events/JumpEvent;)V',
        name='handleEvent')
    def handleJumpEvent(self, event):
            event.getAgentId()
            event.getHeight_m()
            event.getTime()

            
    @jnius.java_method(
        '(Lch/dubernet/demopythonapi/simulation/events/SingEvent;)V',
        name='handleEvent')
    def handleSingEvent(self, event):
            event.getAgentId()
            event.getSong()
            event.getTime()

            
    @jnius.java_method(
        '(Lch/dubernet/demopythonapi/simulation/events/SpeakEvent;)V',
        name='handleEvent')
    def handleSpeakEvent(self, event):
            event.getAgentId()
            event.getMessage()
            event.getTime()

simulation = Simulation(100, 100)
handler = PythonBenchmarkHandler()
simulation.getEvents().addEventHandler(handler)

total_time_s = 0

for i in range(100):
    total_time_s -= time.time()
    simulation.run()
    total_time_s += time.time()
    
print(total_time_s / 100)

Notice the difference in syntax with JPype: here, we can actually implement one method per signature. The format for the signature is the one from the JNI specification, which is unfortunately not the most readable. One still needs to call methods differently on the Python side, but otherwise, it gets pretty close to how one would do it in Java. The output is:

1.2619202899932862

This is roughly twice as bad as JPype. Not good.
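Since the JNI signature strings are easy to get wrong by hand, a tiny helper can build them from ordinary Java type names. This is a sketch with hypothetical function names; it covers primitives, arrays and top-level classes, and does not handle nested classes (which use `$` in their binary name).

```python
# JNI type codes for Java primitive types and void
_PRIMITIVES = {
    "void": "V", "boolean": "Z", "byte": "B", "char": "C", "short": "S",
    "int": "I", "long": "J", "float": "F", "double": "D",
}


def jni_type(java_type):
    """Translate a Java type name such as 'byte[]' or 'java.lang.String'."""
    if java_type.endswith("[]"):
        return "[" + jni_type(java_type[:-2])
    if java_type in _PRIMITIVES:
        return _PRIMITIVES[java_type]
    return "L" + java_type.replace(".", "/") + ";"


def jni_signature(parameter_types, return_type="void"):
    """Build a JNI method signature like '(Lsome/Class;)V'."""
    params = "".join(jni_type(t) for t in parameter_types)
    return f"({params}){jni_type(return_type)}"


print(jni_signature([]))
print(jni_signature(["ch.dubernet.demopythonapi.simulation.events.JumpEvent"]))
```

The two calls print `()V` and `(Lch/dubernet/demopythonapi/simulation/events/JumpEvent;)V`, which are the strings passed to `jnius.java_method` in the listing above.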

Hopefully Better: Using Protocol Buffers

Now let’s see if reducing the amount of back-and-forth between the VMs helps in getting those figures down.

Let’s first implement the Java infrastructure to send events in protocol buffer format. First we need an interface to implement when we want to receive protocol buffer events:

package ch.dubernet.demopythonapi.simulation.api;

public interface ProtobufEventHandler {
    void notifyStart();
    void notifyEnd();

    // Methods need to be specific to an event type for two reasons:
    // - dynamic dispatch obviously does not work
    // - protocol buffer conversion requires knowing the type of message beforehand
    void handleJumpEventMessage(byte[] message);
    void handleSingEventMessage(byte[] message);
    void handleSpeakEventMessage(byte[] message);
}

Then we need an event handler that is able to do the conversion:

package ch.dubernet.demopythonapi.simulation.api;

import ch.dubernet.demopythonapi.simulation.events.*;
import ch.dubernet.demopythonapi.simulation.protobuf.JumpEventOuterClass;
import ch.dubernet.demopythonapi.simulation.protobuf.SingEventOuterClass;
import ch.dubernet.demopythonapi.simulation.protobuf.SpeakEventOuterClass;

public class ProtobufAdapter implements
                JumpEventHandler,
                SingEventHandler,
                SpeakEventHandler {
    private final ProtobufEventHandler delegate;

    public ProtobufAdapter(ProtobufEventHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void handleEvent(JumpEvent event) {
        final JumpEventOuterClass.JumpEvent pb =
                ProtobufUtils.toProtocolBuffer(event);

        delegate.handleJumpEventMessage(pb.toByteArray());
    }

    @Override
    public void handleEvent(SingEvent event) {
        final SingEventOuterClass.SingEvent pb =
                ProtobufUtils.toProtocolBuffer(event);

        delegate.handleSingEventMessage(pb.toByteArray());
    }

    @Override
    public void handleEvent(SpeakEvent event) {
        final SpeakEventOuterClass.SpeakEvent pb =
                ProtobufUtils.toProtocolBuffer(event);

        delegate.handleSpeakEventMessage(pb.toByteArray());
    }

    @Override
    public void notifyEnd() {
        this.delegate.notifyEnd();
    }

    @Override
    public void notifyStart() {
        this.delegate.notifyStart();
    }
}

Everything in the ch.dubernet.demopythonapi.simulation.protobuf package is auto-generated from the protocol buffer definitions. The rest is self-explanatory: really nothing complicated there.

Now let’s implement something for the Python side:

# Assume the JVM is started
import jpype

from ch.dubernet.demopythonapi.simulation.api import ProtobufAdapter

from api.protobuf.JumpEvent_pb2 import JumpEvent
from api.protobuf.SingEvent_pb2 import SingEvent
from api.protobuf.SpeakEvent_pb2 import SpeakEvent

def create_event_handler(handler):
    """
    Create a JProxy object that converts events in protocol buffer format
    """
    class ProtobufHandler:
        def notifyStart(self):
            if hasattr(handler, "notifyStart"):
                handler.notifyStart()

        def notifyEnd(self):
            if hasattr(handler, "notifyEnd"):
                handler.notifyEnd()

        def handleJumpEventMessage(self, message):
            if hasattr(handler, "handleJumpEvent"):
                event = JumpEvent()
                event.ParseFromString(message[:])
                handler.handleJumpEvent(event)

        def handleSingEventMessage(self, message):
            if hasattr(handler, "handleSingEvent"):
                event = SingEvent()
                event.ParseFromString(message[:])
                handler.handleSingEvent(event)

        def handleSpeakEventMessage(self, message):
            if hasattr(handler, "handleSpeakEvent"):
                event = SpeakEvent()
                event.ParseFromString(message[:])
                handler.handleSpeakEvent(event)

    impl = jpype.JProxy(
        "ch.dubernet.demopythonapi.simulation.api.ProtobufEventHandler",
        inst=ProtobufHandler())
    return ProtobufAdapter(impl)

We just add one level of proxying, implementing the new Java interface and forwarding to some Python object if it has methods with the right name. Notice how we use message[:], which says “give me the slice of the array that contains all elements”. What this does is return a numpy array containing the data (see footnote 1), whereas message is only a JArray wrapper type that Python does not understand. From my understanding, it is very efficient because it can access the same memory mapped buffer as the Java counterpart, and so no copying of any sort is necessary.

We now have all we need, we can run it! Omitting the imports, this is:

simulation = Simulation(100, 100)

class PythonPbBenchmarkHandler:    
    def handleJumpEvent(self, event):
            event.agentId
            event.height_m
            event.time
    
    def handleSingEvent(self, event):
            event.agentId
            event.song
            event.time
            
    def handleSpeakEvent(self, event):
            event.agentId
            event.text
            event.time


simulation.getEvents().addEventHandler(
        create_event_handler(
            PythonPbBenchmarkHandler()))

total_time_s = 0

for i in range(100):
    total_time_s -= time.time()
    simulation.run()
    total_time_s += time.time()
    
print(total_time_s / 100)

I am starting to like it: it is getting more “pythonic” by hiding the JPype part. The output is:

0.26328364610671995

Roughly twice as fast as pure JPype. Better, but still not great.

How does Jnius fare here? We can use the same kind of approach:

def create_event_handler(handler):
    class ProtobufHandler(jnius.PythonJavaClass):
        __javainterfaces__ = [
            'ch/dubernet/demopythonapi/simulation/api/ProtobufEventHandler'
        ]

        @jnius.java_method('()V')
        def notifyStart(self):
            if hasattr(handler, "notifyStart"):
                handler.notifyStart()

        @jnius.java_method('()V')
        def notifyEnd(self):
            if hasattr(handler, "notifyEnd"):
                handler.notifyEnd()

        @jnius.java_method('([B)V')
        def handleJumpEventMessage(self, message):
            if hasattr(handler, "handleJumpEvent"):
                event = JumpEvent()
                event.ParseFromString(bytearray(message))
                handler.handleJumpEvent(event)

        @jnius.java_method('([B)V')
        def handleSingEventMessage(self, message):
            if hasattr(handler, "handleSingEvent"):
                event = SingEvent()
                event.ParseFromString(bytearray(message))
                handler.handleSingEvent(event)

        @jnius.java_method('([B)V')
        def handleSpeakEventMessage(self, message):
            if hasattr(handler, "handleSpeakEvent"):
                event = SpeakEvent()
                event.ParseFromString(bytearray(message))
                handler.handleSpeakEvent(event)

    impl = ProtobufHandler()
    adapter = ProtobufAdapter(impl)
    # This is a hack to solve https://github.com/kivy/pyjnius/issues/59
    # Basically, one needs to keep a reference to impl in python,
    # to avoid the GC removing it from below our feet.
    # This is an illustration of how hard this inter-VM communication can be...
    adapter._back_ref = impl
    return adapter

Notice how we now use bytearray(message): jnius gives us Python lists, and does not support numpy arrays yet. Also notice the lines before the return statement: you need to keep a reference in Python to the objects you hand over to Java, or the GC will delete them… We do this by creating an ad-hoc Python field on the Java wrapper.

This results in a similar run script, with the difference that one needs to make sure to keep Python references to the Java “adapter” object:

handler = PythonPbBenchmarkHandler()
adapter = api.demojnius.create_event_handler(handler)
simulation.getEvents().addEventHandler(adapter)

The output is:

1.0965033042430878

This is… no improvement. This probably has to do with using python lists. No matter how much I tried to hack (for instance trying to pass the data as a String), I did not manage to solve that problem. Combined with the difficulties I had to get it to work (in particular those GC issues), this leads me to give up on Jnius. Sorry, Jnius.
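The garbage collection pitfall mentioned above can be mimicked in plain Python. The sketch below is only an analogy: the "Java side" is modeled by a weak reference, under the assumption that a reference held by the JVM is invisible to Python's garbage collector. The class and function names are hypothetical.

```python
import gc
import weakref


class PythonImpl:
    """Stands in for the Python object implementing the Java interface."""


class Adapter:
    """Stands in for the Java-side adapter: it holds NO strong Python reference."""

    def __init__(self, impl):
        self._impl_ref = weakref.ref(impl)

    def is_alive(self):
        return self._impl_ref() is not None


def make_adapter(keep_back_ref):
    impl = PythonImpl()
    adapter = Adapter(impl)
    if keep_back_ref:
        # Same trick as in create_event_handler above: an ad-hoc attribute
        # keeps the implementation alive as long as the adapter is.
        adapter._back_ref = impl
    return adapter


lost = make_adapter(keep_back_ref=False)
kept = make_adapter(keep_back_ref=True)
gc.collect()
print(lost.is_alive(), kept.is_alive())
```

Without the back reference, the implementation is gone as soon as the factory function returns; with it, the implementation lives exactly as long as the adapter.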

Buffering the Protocol Buffers

OK, so Jnius is out of the race, and JPype is still not awesomely fast. Can we get better? Remember we cannot really get rid of conversion costs, but can try to minimize communication costs further. The idea is to send events not one by one, but first buffer them and send them only when the buffer is full, making sure we send whatever remains in the buffer at the end of the simulation. This buffer itself will be expressed in protocol buffer format:

syntax = "proto3";

option java_package = "ch.dubernet.demopythonapi.simulation.protobuf";

import "protobuf/JumpEvent.proto";
import "protobuf/SingEvent.proto";
import "protobuf/SpeakEvent.proto";

message EventBuffer {
    repeated EventContainer events = 1;
}

/**
 * Acts as a kind of case class to be able to build the buffer
 */
message EventContainer {
    oneof event {
        JumpEvent jumpEvent = 1;
        SingEvent singEvent = 2;
        SpeakEvent speakEvent = 3;
    }
}

The code is really the same, simply using that new protocol buffer format, so I will omit it here. With a buffer big enough to fit all the events, the output becomes:

0.03600198030471802

This is now “only” one order of magnitude slower than pure Java. This is not as bad as it sounds, as both our simulation and “analysis” are much more lightweight than what you would have in a real application.
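The buffering logic itself, omitted above, can be sketched in plain Python. This is not the code from the repository: `BufferingHandler` is a hypothetical name, events are kept in a Python list instead of an EventBuffer message, and `delegate` is assumed to be any callable that accepts a list of events.

```python
class BufferingHandler:
    """Collects events and forwards them in batches of at most buffer_size."""

    def __init__(self, delegate, buffer_size):
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self._delegate = delegate
        self._buffer_size = buffer_size
        self._buffer = []

    def notifyStart(self):
        # Start each simulation run with an empty buffer
        self._buffer = []

    def handleEvent(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._buffer_size:
            self._flush()

    def notifyEnd(self):
        # Send whatever remains in the buffer at the end of the simulation
        if self._buffer:
            self._flush()

    def _flush(self):
        batch, self._buffer = self._buffer, []
        self._delegate(batch)


batches = []
handler = BufferingHandler(batches.append, buffer_size=4)
handler.notifyStart()
for event in range(10):
    handler.handleEvent(event)
handler.notifyEnd()
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Ten events are delivered in three calls instead of ten; the last, partial batch is only sent because of the flush in notifyEnd.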

The question is: how big should the buffer be? The answer is actually “not big at all”:

[Figure: Scaling of runtime with increasing buffer size]

Runtime decreases exponentially with buffer size, and after about 20 events in the buffer, gains can be neglected.
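A back-of-the-envelope way to see why small buffers already help: count the calls from Java to Python needed to deliver all events. The sketch assumes one event per agent and time step (so 100 × 100 events per run in the benchmark) and ignores conversion costs, which buffering does not remove.

```python
import math


def boundary_crossings(n_events, buffer_size):
    """Number of Java-to-Python calls needed to deliver n_events in batches."""
    return math.ceil(n_events / buffer_size)


# 100 agents x 100 time steps, one event per agent and step
N_EVENTS = 100 * 100

for size in (1, 5, 20, 100):
    print(size, boundary_crossings(N_EVENTS, size))
```

This prints 10000, 2000, 500 and 100 calls respectively: a buffer of 20 events already removes 95% of the calls, so larger buffers have little left to gain.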

TL;DR

JPype and Jnius are two great libraries to design Python APIs for Java programs, but they come with some overhead, that can become a problem when one implements Java interfaces that get called a lot.

Though Jnius’ interface can be nicer, it has issues with objects still in use being garbage collected, and its performance cost leads me to prefer JPype for such applications. JPype’s numpy integration works well in combination with Protocol Buffers, significantly decreasing performance overhead.

What Next?

This was only a quick experiment, and some additional work would be needed to create a full-fledged Python API.

I already started applying those concepts to designing a Python API for MATSim, which I hope to have ready for Fall Semester 2019. Stay tuned!

Do you feel this approach is useful? Give your opinion in the comments!

  1. This requires JPype to be compiled with numpy support. Apparently, this is not the case when installing from PyPI at the time of writing, but this can be solved by installing directly from GitHub with pip install git+https://github.com/jpype-project/jpype@master, making sure numpy was installed before. You can follow the current state of this issue here
