So, in this talk I'm going to tell you the story behind SQLancer, which is a tool that we designed and developed to automatically test database management systems. Now I assume that many of you are not directly working on developing database management systems, which is why I will focus on general insights that you might be able to apply on testing your project. So just to ensure that we're all on the same page - in our work we wanted to test database management systems that expect queries or statements written in the structured query language or for short SQL. As perhaps most of you know, we can for example use this language to create tables, to insert data into them, and then again fetch data using a select statement. Now in the previous talks we already heard a bit about testing, so how can we write test cases that could test such systems? Well, one approach would be to just write them in SQL. We could for example specify the test case, such as here taken from the MySQL test to it, and then in a separate file specify the expected output. And while there are some frameworks that make this process a bit easier, I would argue that it's still always challenging and time consuming to write manual test cases, especially considering for such complex systems where we need a huge test suites. So in our work we ask ourselves, can we automate the test process? And this brings us to two challenges. First of all, if we want to have an automated testing approach, we need a test case, and in our case this means that we need to generate a database and a query that we can then validate. So looking into what's already out there, we can find tools such as for example SQLsmith. SQLsmith is an effective and widely used brand new query generation tool that mutates and generates complex SQL statements that might cause for example a crash in the database system that developers can then fix. So I would say that this is already a solved challenge, so we can use either this random generation approach or another one to generate the test cases. Now what was still an open challenge is the so-called test oracle problem, namely we want to validate the query's result. So what's a test oracle? A test oracle is basically the mechanism that determines whether a test has passed or failed, and for regression testing for writing manual test cases, we typically as the developer or developers are - or provided the test oracle. But this is often difficult, even for manually written test cases, and I want to illustrate this based on a concrete example. You can see here a test case that actually allowed us to find and report a bug in MySQL. You can see here that we have two tables t0 and t1, each of which holds a floating point value, namely a positive zero and a negative zero. And then we instruct MySQL to fetch the cross product of all records from these two tables where the column evaluates to the same value. Now what's the expected output here? I would argue that it's disputable whether the predicate here should evaluate to true or false because we could for example look at the binary representation of these two floating point numbers to realize that the sign bit differs. And based on this it would make sense that the equality operator evaluates to false and MySQL for this case would return an empty result. But most programming languages such as Java, C, and so on, they actually define the semantics that this equality operator should evaluate to true for - for these values. And in this case MySQL would fetch a row. So at this point in a talk you have to trust me, actually MySQL is expected to fetch the record. But when we tested the latest version of it, it still failed to do so. We reported this as a bug to the developers who then fixed it for the next release of MySQL. But the point here is actually that we could find this bug without having an idea on whether a predicate should evaluate to true or not and whether any results or any records should be fetched. We basically had this test oracle that told us that the result returned was correct. Now we actually developed a couple of test oracles, the one I'm going to present today is termed logic partitioning or short TLB. And if I had to explain the approach in half a sentence I would basically say, the idea is to test the database management system against itself. The idea is quite generous I want to explain it based on an analogy. So if we ever meet in person it will it's very likely that this will be in the coffee kitchen because I like to drink lots of coffee. And let's also assume that there is a bowl with fruits containing both tangerines and also clementines. Now I'm actually very bad at telling these different citrus fruits apart, they all look the same to me, so this is what I tell you in order to start some small talk. And you might actually respond that you can clearly tell the difference. So I challenge you to show me, but how can I test your understanding of tangerines versus clementines without even having this understanding myself? Well the strategy that I would apply would be the following.First I would ask you to please bring me all the clementines and you would go ahead and fetch perhaps two of the fruits. I would then put them back into the bowl, shuffle them around, and then ask you to bring me all foods that are not clementines. You would bring the apple which I could recognize, along with the other orange-looking fruits. And I believe that already some of you see where this is going because you brought me two fruits first and then you brought me four fruits, so overall six fruits, but there are only five fruits in the bowl, meaning that you likely classified a fruit as being both a tangerine and a clementine. So the high level insight here is that every object in the universe and also in the bowl is either a clementine or not a climate and we use this insight to basically partition all the objects in the bowl. Of course you could always say - you could always consistently classify tangerines as clementines and the other way around, meaning that we cannot detect all the bugs, but neither can test - test cases, so this is a limitation that we have to live with. How do we now apply this to SQL? Now we have any kind of given predicate P and a row R rather than a fruit, but exactly one of the following must hold the predicate evaluates to true, not the predicate or is the true meaning that it evaluates to false, and then since in SQL we also have to deal with null values, we have to deal with the case where P might evaluate to null. And based on this we can now take any kind of data in our table - any kind of intermediate result - and partition it into three parts: then we do those for the predicate where it's true, those where it evaluates to false, and those where it evaluates to null. How did this now allow us to detect the bug from the motivating example? Well, we first generated this query that basically corresponds to counting the number of fruits in the bowl. So we simply fetch the cross product of all values from these two tables. And there MySQL returned a single record. Then we generated this more complex three queries based on this random predicate P. You can see here that we have the non-negated version, the negated version, and this null version. Overall we expect that this should compute the same result. However MySQL returned an empty result set here, which allowed us to find and report the bug. The technique is actually quite versatile in the sense that we can test multiple different kind of SQL features, here only focused on where clauses. So we basically had this random generation approach to tackle this test case generation problem, and now we propose ternary logic partitioning to tackle this test oracle problem. Can such a simple technique be effective is the next natural question? well I used SQL answer TLB being one of these approaches that we proposed to find over 450 unique and previously unknown bugs in Baidu's database management systems that you have all heard of or know. And what should you take away from this talk? Well, while the specific technique data partitioning primarily works for data-oriented systems, it is based actually on a more generic - more general kind of technique. Namely if we look at this from an abstract level, we basically can realize that we had a test case that we executed to obtain a result and based on this we derived a new test case for which we could then provide a test oracle. And this does metamorphic testing. Now what you can do is, you could try of course to come up with a metamorphic testing technique for yourself but an alternative would also be to check if there are any already existing approaches, for example by looking on Google Scholar. And with that, I'll summarize. And the takeaways, so manually writing test cases is time intensive and requires detailed domain knowledge and you might actually face a similar problem in your work. So what we propose is to couple random test case generation with metamorphic testing which turned out to effectively complement manually written test cases. With that, thank you for listening.