There are plenty of free online resources available that can be used to learn how to interact with databases using SQL. Places like sqlcourse.com and sqlzoo.net tell students all about the effective use of statements like “select”, “update”, “join” and “group by”. The best part is that once you start learning, there is not all that much complexity about it – all of these materials can be processed in less than a day.
However, while these “general purpose” courses cover all the basic operations, they fail to mention why we as testers should take the time to learn SQL in the first place – and how our needs differ from those of our developers. This is what the rest of this article is about – to describe how taking control of the database has helped improve my day-to-day testing and why I think it can enhance everyday testing work in various contexts.
Modelling and thinking about your application: As testers, we rely on written and mental models to come up with test ideas and to think of our software in various dimensions. So most importantly for me, approaching the product under test from the database perspective allows you to model the application in a new way; thus supplementing any existing thought patterns.
Going over a database model is a great start, but I think that having hands-on experience gives me a better understanding of how specific features are implemented, as well as how the application as a whole is put together. This can be a tremendous asset to finding interesting bugs in upcoming features – even outside the scope of SQL. In addition, you might be able to anticipate possible failure scenarios of new functionality based on how they will interact with the database. By letting mind wander around database interactions, I have found showstopper bugs while drinking coffee on my balcony.
Fun fact: modern development frameworks create a lot of boilerplate database structures and interactions automatically, so developers can sometimes become quite distant from the actual internals of the database. As a tester, you might get into a position where nobody knows the ins and outs of the database as well as you do!
Visibility: After performing an action via the front end or API, you may need to check whether something was actually stored or updated in the database (and whether everything else was left intact). You can also monitor internal states of the application that are otherwise hidden.
Controllability: With Insert and Update rights, you can perform a specific change on the application’s data that might otherwise be hard to trigger, such as time related functionality. As an example, when testing recurring billing functionality, you could edit relevant date fields to test business flows that would normally have you waiting for weeks. Additionally, you can perform massive changes with a single query – such as replacing all emails, phone numbers and passwords with new test data. As an example, I have tested various systems that demanded unique and valid email addresses and phone numbers as part of account creation. I could create accounts from the front end – using my actual email and phone number – then replace them with dummy data using ‘update’ queries.
New challenges: Some features might exist entirely at the database layer, thus being really difficult to test in an end-to-end scenario. When there are no testers around who have the required skills to investigate those features, developers might have to test them instead. Even worse, such “untestable” features might be pushed directly to production. As an example, I have tested message queues that were only visible at the database layer, but had very specific requirements for the priority in which messages of various types were to be sent. Using queries that counted the amount of messages in the queue grouped by the message type, I was able to observe in real time how the prioritization behaved.
Get to know your data: Depending on your context, access to a (obfuscated) copy of production data could allow you to gauge the impact a bug would have on the customers. As an example, I once encountered a serious bug that was deemed extremely difficult to fix. However, it only affected specific types of legacy accounts under specific circumstances. Querying the database revealed that only an extremely small amount of actual users were using this type of account, changing the priority of the bug significantly.
In addition, database queries help you find realistic data to test with. Testing with self-generated test data is great to show that software ‘can work’, but for adequate coverage, it usually makes sense to make the test environment as close to production as possible – and a big part of that is having realistic, messy test data. As an example, I once encountered a strange ‘internal server error’ bug that could only be replicated on some production accounts, but never with test data. The root cause turned out to be the specific way in which some accounts were created in previous versions of the product.
Building trust: Last but not least, database access is a ‘key to the castle’, as described in Ioana Serban’s CAST2015 talk (database access is specifically discussed at the 35 minute mark). Working in an outsourced testing company, I am used to ‘asking for keys to the castle’ whenever a new project starts. Getting an increased level of access and showing how much this helps the testing effort – without breaking anything – can be used to increase mutual trust and respect within the project team.
Generate test data: If you need a thousand accounts for performance testing, it might be ok to use database queries for that – in so far as you can be sure that accounts created directly in the database do not differ from normal accounts in a performance significant way.
However, I urge you to be careful with query generated accounts for functional testing – there might be minute details – such as encoding or input field length limitations – that cause your test account (or any testdata really) to act differently from an account that was created by a real user.
As an example, I once tested an application that required measurements for every hour of every day for a full year – that is 8760 data points every time I wanted to test something. Using insert statements from a spreadsheet, I could add the exact test data I wanted in seconds.
As a closing note, I recommended that the fastest way to learn SQL is to try out statements on whatever database driven application you are testing at the moment. Working with familiar data makes learning much faster, since you will already know what most of the tables are used for. Ideally, I would get in touch with developers who interact with the database as well – they can be a great help in making queries return exactly the data that you want to see. In time, this will work the other way around too – you can be asked to review or test database statements affecting production data.
This is the end of my list for now. You are welcome to post a comment if you disagree with something or want to add your own ideas!