== Some things on Django's CSRF protection, sessions, and ((REMOTE_USER))

We have a [[Django application DjangoORMDesignPuzzleII]] where we've had [[mysterious CSRF problems Django111CSRFFailures]] in the past, which I've theorized was partly because we use it behind [[Apache HTTP Basic Authentication ../web/ApacheBasicAuthWhy]]. As part of recovering [[my understanding of Django and Apache HTTP Basic Authentication DjangoApacheBasicAuth]], I've been digging into how Django's CSRF protection works and how it interacts with all of this.

Our starting point is Django's documentation on [[Cross Site Request Forgery protection https://docs.djangoproject.com/en/4.0/ref/csrf/]]. [[How it works https://docs.djangoproject.com/en/4.0/ref/csrf/#how-it-works]] is that Django sets a CSRF cookie and then embeds a hidden form field; on form submission, the two pieces of information must be present and match ([[everyone does something like this ../web/CSRFCookieRequirement]]). The CSRF cookie and the form field are both derived from a shared secret to protect against [[BREACH attacks https://breachattack.com/]]. The important thing about this shared secret in some situations is, well, let me quote the documentation:

> For security reasons, ~~the value of the secret is changed each time a
> user logs in~~.

In a Django environment with normal authentication, it's clear when a user logs in; it's when they go through the Django login process, providing Django with a clear moment to establish an authenticated session, rotate secrets, and so on. In an environment where Django is instead relying on external authentication via ``REMOTE_USER'', it's not so clear. [[The documentation https://docs.djangoproject.com/en/4.0/howto/auth-remote-user/]] says only that RemoteUserMiddleware will detect the username to authenticate and auto-login that user.

The answer to this turns out to involve [[Django sessions https://docs.djangoproject.com/en/4.0/topics/http/sessions/]].
When you have sessions enabled in Django, which you normally do, all requests have an associated session (visible in _request.session_). To simplify, sessions are identified and tracked by browser cookies, with one created on the fly if necessary (along with a new session). A session may be anonymous or [[may be for an authenticated user https://docs.djangoproject.com/en/4.0/topics/auth/default/#auth-web-requests]]. If the session object for the current request lacks an authenticated user but the request has a ``REMOTE_USER'', RemoteUserMiddleware 'logs in' the indicated user, which will rotate the CSRF secret.

(I'm not sure how Django handles CSRF secrets for anonymous, unauthenticated people. Some versions appear to set the CSRF browser cookie without any session cookie.)

In the default Django configuration, this creates an important split between when you think you've logged in and when Django thinks you've logged in. You think you're logging in any time you have to enter your login and password for HTTP Basic Authentication (which is normally only once, until you quit the browser). However, Django only thinks you're logging in if your session is unauthenticated, and the session cookie Django sets in your browser normally lasts for two weeks ([[cf https://docs.djangoproject.com/en/4.0/ref/settings/#std:setting-SESSION_COOKIE_AGE]]). Before the cookie expires you can quit your browser, start it up again, re-do HTTP Basic Authentication, and not log in from Django's perspective because your session is still fine. Equally, you can keep your browser running and authenticated for more than two weeks, at which point your session cookie will expire and Django will consider you to be logging back in again (with a CSRF secret rotation) even though you were never challenged for a password.

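
To make the split concrete, here is a toy model of the decision described above. It is not Django's code; ``ToySession'' and ``handle_request'' are names I made up for illustration, and it simplifies by counting the two weeks from when the session was created.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Django's default SESSION_COOKIE_AGE is two weeks, expressed in seconds.
SESSION_COOKIE_AGE = 14 * 24 * 60 * 60


@dataclass
class ToySession:
    """Hypothetical stand-in for a Django session (illustration only)."""
    user: Optional[str] = None
    created_at: float = 0.0


def handle_request(
    session: Optional[ToySession], remote_user: Optional[str], now: float
) -> Tuple[ToySession, bool]:
    """Return (session, logged_in_now) for one request.

    logged_in_now is True when this model would 'log in' the user,
    which is the moment the CSRF secret would be rotated.
    """
    # No session cookie, or an expired one: start a fresh anonymous session.
    if session is None or now - session.created_at >= SESSION_COOKIE_AGE:
        session = ToySession(created_at=now)
    # The login decision looks only at the session, not at whether the
    # browser just prompted for an HTTP Basic Authentication password.
    if remote_user and session.user != remote_user:
        session.user = remote_user
        return session, True
    return session, False
```

In this model, a request carrying the same session within two weeks never counts as a login no matter how often the browser re-authenticates, while the first request after the two weeks always does.
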
(If you [[use the relevant setting https://docs.djangoproject.com/en/4.0/ref/settings/#std:setting-SESSION_EXPIRE_AT_BROWSER_CLOSE]] to tell Django to use a browser session cookie to identify the Django session, you at least more or less synchronize Django's view of you logging in with your view of it.)

The other wrinkle is that if RemoteUserMiddleware sees an authenticated session for a request without ``REMOTE_USER'' set, it logs the session out. This is [[half-documented by implication https://docs.djangoproject.com/en/4.0/howto/auth-remote-user/#using-remote-user-on-login-pages-only]], but you have to remember (or know) that 'all authenticated requests' means 'all requests with a session that thinks it's authenticated' (and the documentation doesn't actually say that your session gets logged out).

This matters if part of your application is generally accessible (for anyone to submit an account request) while part of it is protected by HTTP Basic Authentication (for authorized people to approve those requests for accounts). Suppose that you go to approve an account request, which involves a CSRF protected form, but then pause and in another window go look at the unprotected account request submission page. You're now invisibly logged out, and when you submit the form in your first window, you will be logged back in, which triggers CSRF secret rotation, which invalidates the CSRF secret that underlies both the cookie and the form you just submitted. To get around this, I think you want to use PersistentRemoteUserMiddleware instead. Or tell people not to do this.

(Much or all of this goes back at least to Django 1.10 and I don't think it changed between 1.10 and 1.11, so all of this still doesn't really explain [[our CSRF issue in 1.11 Django111CSRFFailures]]. But at least I can now probably make problems much less likely in any version of Django.)

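
The two-window scenario can be sketched as a toy model in plain Python. This follows my reading of the behaviour described above rather than Django's actual code. ``ToyApp'' and its methods are invented for illustration, the form is treated as carrying the raw secret (real Django masks it), and the ``persistent'' flag stands in for using PersistentRemoteUserMiddleware.

```python
import secrets
from typing import Optional


class ToyApp:
    """Hypothetical model of one browser's session (illustration only)."""

    def __init__(self, persistent: bool = False) -> None:
        # persistent=True models PersistentRemoteUserMiddleware, which keeps
        # the session authenticated when REMOTE_USER is absent.
        self.persistent = persistent
        self.session_user: Optional[str] = None
        self.csrf_secret = secrets.token_hex(16)

    def request(self, remote_user: Optional[str]) -> str:
        """Process one request and return the CSRF secret now in effect."""
        if remote_user is None:
            if self.session_user is not None and not self.persistent:
                # Authenticated session but no REMOTE_USER: silent logout.
                self.session_user = None
        elif self.session_user != remote_user:
            # Unauthenticated session plus REMOTE_USER: 'log in' the user,
            # which rotates the CSRF secret.
            self.session_user = remote_user
            self.csrf_secret = secrets.token_hex(16)
        return self.csrf_secret

    def submit_form(self, remote_user: Optional[str], form_secret: str) -> bool:
        """Return True if the form's secret still matches after this request."""
        current = self.request(remote_user)
        return secrets.compare_digest(current, form_secret)


def two_window_scenario(persistent: bool) -> bool:
    app = ToyApp(persistent=persistent)
    form_secret = app.request("alice")   # window 1: load the approval form
    app.request(None)                    # window 2: visit the public page
    return app.submit_form("alice", form_secret)  # window 1: submit
```

In this model ``two_window_scenario(False)'' fails the CSRF comparison because the second login rotated the secret, while ``two_window_scenario(True)'' passes because the session never lost its authenticated user.
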
PS: One thing that [[the sessions documentation https://docs.djangoproject.com/en/4.0/topics/http/sessions/]] tells you that I didn't previously know is that in the default configuration where sessions are saved in your database, [[you need to clear old expired ones out of it periodically https://docs.djangoproject.com/en/4.0/topics/http/sessions/#clearing-the-session-store]] with '_django-admin clearsessions_'. We hadn't been doing that, and so had session entries going back to 2016. The saving grace is that I don't think sessions get written to the database until they really have something in them, like an authenticated user; otherwise we'd have a lot more of them in the database than we do.