Working in the Software-as-a-Service space means that you’re dealing with loads of logging and diagnostic information being generated every second. Access logs for web services, audit logs for administrative access, API calls, and automated notification systems such as your NMS (Network Monitoring System) are spread throughout the environment. When we talk privacy, we often don’t consider these operational systems, as they are removed from the client-facing side and, on the surface, don’t contain sensitive data. They often do, though, and we need to be cognizant of the fact that identifiers in these logs can be Personal Information. Here are a few pitfalls that you may find within a software or SaaS environment.
Web/Application Server Logs
Whether it’s Apache, Tomcat, Internet Information Services, nginx or almost any other flavour of web server, you’re going to have access logs attached to it. These logs are used for diagnostics and to see who has accessed what, at what time. By default, these logs include a number of pieces of information, most notably an IP address for the client that is connecting. These client IP addresses are important to note as they are seen as Personally Identifiable Information (PII).
Part of the definition of PII is information that can reasonably be used to identify an individual, and an IP address, machine identifier, and details in a cookie could be combined to give a degree of certainty as to someone’s identity. In the EU even dynamic client IP addresses are seen to be personal, as the ISP can be called on to identify the user assigned to a particular IP at a particular time.
What can you do?
Use a method to obfuscate the IP addresses to a degree. On some of my hosted services, I do not record the last octet of an IP address (the final group of up to three digits). This means that I can still identify a single connection, but I have no way of identifying who the connected party is. In addition to this, configure your log files to log only what is necessary for your purposes, and be clear about that in your privacy policies.
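As a rough illustration of the kind of obfuscation described above, here is a minimal Python sketch that zeroes the host portion of an address before it is written to a log. The function name `mask_ip` and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are my own assumptions for the example, not a standard; pick whatever granularity suits your purposes.

```python
import ipaddress


def mask_ip(address: str) -> str:
    """Return the address with its host portion zeroed out."""
    ip = ipaddress.ip_address(address)
    # Assumption: keep the first 24 bits of IPv4 and the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("203.0.113.57"))  # 203.0.113.0
```

Many web servers can do something similar in their own log configuration, which is preferable where available, as the full address then never reaches disk.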
API Calls
If you are passing information to an API, or logging API requests where information is passed within the URL, you may be logging personal information. Having an identifier such as an email address, username, social security or identity number, or similar passed within the URL is a bad idea. These entries are logged in web logs as above, but also potentially in your software and routing equipment. Similarly, they would be logged on the client side and by the network devices on the way to your endpoint location. In other words, it’s just a really bad idea.
What can you do?
Build security into your APIs, and deliver them over secure methods such as HTTPS. Avoid any inputs or outputs of sensitive data being used in the URL, and instead keep these within the message payload (or avoid using them entirely).
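Keeping identifiers out of the URL is the real fix, but older endpoints may still receive them. As a safety net, a sketch along these lines could scrub known parameters before a URL is logged. The parameter names in `SENSITIVE_PARAMS` and the function `redact_url` are hypothetical; substitute the names your own API actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that should never reach a log file.
SENSITIVE_PARAMS = {"email", "username", "ssn", "id_number"}


def redact_url(url: str) -> str:
    """Replace the values of sensitive query parameters with a placeholder."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(redact_url("https://api.example.com/v1/users?email=jane%40example.com&page=2"))
# https://api.example.com/v1/users?email=REDACTED&page=2
```

Note that this only protects the logs you control. The original URL will still have been seen by the client and any intermediaries, which is why moving the data into the message payload remains the better option.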
Audit Logs
As part of your security safeguards in the business, you should be recording and auditing administrative access to your servers. If you have cyber insurance, it’s likely that you are required to keep 6 months’ worth of logs of this nature. While these logs won’t have information about your clients and end-users, they will contain identifiable information about your employees.
Should there be a Data Subject Access Request (DSAR) from an ex-employee, you will need to be aware of this data and know whether you would honour or deny a request for deletion or similar.
What can you do?
Have a defined retention period for these logs and ensure that you are sticking to it. Be aware of where logs move, and if someone is archiving your logs, make sure that they are handled in a way that reflects the sensitivity of the information they contain.
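Sticking to a retention period is easier when it is automated. The sketch below shows one way a scheduled job might remove log files older than a chosen number of days. The function name `purge_old_logs`, the `*.log*` file pattern, and the use of file modification time as the age of a log are all assumptions made for the example.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_logs(log_dir: str, retention_days: int,
                   now: Optional[float] = None) -> List[str]:
    """Delete log files last modified before the retention cutoff.

    Returns the names of the files that were removed.
    """
    current = time.time() if now is None else now
    cutoff = current - retention_days * 86400  # seconds per day
    removed = []
    # Assumption: logs and their rotated copies match "*.log*" in one directory.
    for path in sorted(Path(log_dir).glob("*.log*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

A call such as `purge_old_logs("/var/log/myapp", retention_days=180)` (a hypothetical path and period) could then be run daily. Remember that archived or forwarded copies need the same treatment, and that a plain delete is not a secure wipe.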
Automated Logging Alerts
For the most part, logging alerts will be simplified and not have any PII in them to begin with. That is, until someone changes the log level from INFO to DEBUG (or similar). Suddenly, a whole host of additional information is logged, even if only for a brief moment. This information, when fed into your monitoring system or even something as simple as an email, means that personal information is now potentially being shared across the network.
As an example, if we look at my router’s logs we can see that Debug mode gives a whole lot more information, including my user ID when accessing my internet connection – information that is considered personal.
What can you do?
If the debug logging is coming from your own software, ensure that only the information you need to effectively diagnose an issue would end up in a debug log. If PII is not necessary, don’t include it. If you don’t have control over what is included, then ensure that all your services are logging at the correct level (most likely INFO). Lastly, if there is PII in your INFO logs, I would seriously reconsider how you are logging!
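Where the debug output comes from your own Python code, one possible safeguard is a logging filter that scrubs obvious identifiers before a record reaches any handler. This is a minimal sketch that only catches email addresses; the class name `RedactEmailFilter` and the deliberately simple regular expression are assumptions for illustration, and other identifiers would need patterns of their own.

```python
import logging
import re

# Simplified pattern for illustration; it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then scrub the result.
        record.msg = EMAIL_RE.sub("[email redacted]", record.getMessage())
        record.args = None
        return True  # never drop the record, only clean it


handler = logging.StreamHandler()
handler.addFilter(RedactEmailFilter())
logger = logging.getLogger("app")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("login failed for %s", "jane@example.com")
```

Attaching the filter to the handler rather than the logger means it also applies to records propagated up from child loggers. Treat it as a backstop: not logging the identifier in the first place is still the safer design.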
Packet Sniffing
This last one is perhaps a bit more obvious than the others, but if you’re using packet sniffing as a diagnostic tool (Wireshark, Fiddler or similar), you’re going to collect some personal information in these logs, at the very least IP addresses. If the logs are shared, you need to define who should have access and limit it. The last thing you need is rogue logs lying around the network with personal information in them. If you are sniffing an insecure protocol, you’re likely to be recording everything from usernames and passwords down to the information that people submit in your system, which is an incredibly invasive position to be in!
What can you do?
Have defined retention periods and responsibilities within your supporting teams. They need to be aware of the sensitivity of these logs and handle the files accordingly. Information should be securely deleted when no longer needed, or if archiving is required, the logs should be anonymised so that PII is no longer usable.
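If a capture has been exported to text and must be archived, one option is to replace each IP address with a keyed token so that traffic patterns stay readable while the addresses themselves do not. The sketch below is an assumption-laden example: `pseudonymise_ips` is a hypothetical helper, it handles IPv4 only, and strictly speaking it produces pseudonymised rather than anonymised data for as long as the key exists.

```python
import hashlib
import hmac
import re

# Loose IPv4 pattern for illustration; it does not validate octet ranges.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def pseudonymise_ips(text: str, key: bytes) -> str:
    """Replace each IPv4 address with a stable token derived from a secret key."""

    def token(match: "re.Match[str]") -> str:
        digest = hmac.new(key, match.group(0).encode(), hashlib.sha256).hexdigest()
        # The same address always maps to the same 12-character token.
        return f"ip-{digest[:12]}"

    return IPV4_RE.sub(token, text)
```

A keyed hash is used because the IPv4 address space is small enough that a plain hash could be reversed by trying every address. Destroying the key once the archive is written removes the practical means of mapping tokens back to addresses.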
As a tech or software company, all of the above items are likely present within your environment whether you know it or not. It is key that you are aware of these potential blind spots and that they are addressed as part of your data protection / privacy programme. Should you need assistance in identifying your own organisation’s areas of concern, reach out to Ross for a no-obligation enquiry call.
Ross G Saunders Consulting is a niche data protection consultancy, working with a number of professional partners in order to help you as a business comply with data protection regulation. We help with business process, compliance, documentation and more, and can offer a full range of services to take the hassle out of data protection. Why not reach out to find out how we can help you gain a competitive advantage while simultaneously garnering support from your existing and potential customers?