Or “Why are all my processes waiting on memory. There’s tonnes of memory”
It’s probably not new news that there was a fairly nasty memory-related bug in SQL 2005 RTM and SP1 that was related to the relaxing of limits on cache size. Specifically the TokenAndPermUserStore cache.
On systems with large amounts of memory (20GB+) and frequent ad-hoc queries or significant usage of dynamic SQL, the cache can grow quite large, and by quite large I’m talking upwards of 2GB. I think I saw the cache at close on 8GB at one time on one of my servers.
The problem with this is that is takes quite a bit of time to search through several GB of cache to find the required tokens. Making matters worse, access to that cache is synchronised, so only a single thread may have access at a time.
The main symptom of that problem is lots of CMEMTHREAD waits without an apparent wait resource and a higher than normal CPU usage.
But that problem was fixed in SP2 with a change to the caching behaviour. Right?
Well, not really.
While investigating memory-related waits on one of my servers, I ran into a few blog entries that seemed to suggest that the caching bug was not fixed in SP2, not completely anyway.
Troubleshooting Performance issues with TokenAndPermUserStore in SQL2005 x64
More TokenAndPermUserStore
TokenAndPermUserStore continued
Wesley’s blog point to an entry in the list of fixed included in the 4th cumulative update that related to the token cache. Interesting that there was no KB article for it.
To sum things up. If you’ve got a SQL 2005 server with lots of memory, you’re running a build before 3179 and you’re seeing lots of CMEMTHREAD waits, you may be running into this bug. If flushing the cache (DBCC FREESYSTEMCACHE (‘TokenAndPermUserStore’)) makes the problems go away, then it’s almost certain. Get hold of the latest cumulative rollup and see if it helps.
p.s. There has been at least one report that the problem persists even in the latest build. If I find more on that, I’ll post it.
This issue seems to be significantly better with Build 3186 than it was in Build 3161, but I still see TokenAndUserPermStore bloat on one of my production servers. I just have an Agent job that runs DBCC FREESYSTEMCACHE(‘TokenAndPermUserStore’) once a day.
Thanks. That’s good to know. We’re on SP2 and ran slap-bang into this problem on Friday. We’re currently looking at getting the latest build
For now, I’ve got a job runs every 15 min and clears the cache if it exceeds 750MB. I get a flush about every 2-3 hours
Pingback: SQL in the Wild » Blog Archive » Token and perm user store growth
Pingback: March Madness - SQL Azure - Wait Events | SQLRockstar | Thomas LaRock