An important thing to keep in mind here is that when working on network applications, the more important performance metric is "latency-per-task", not the raw CPU-cycle throughput of the entire application. To that end, thread message queues tend to be a very good method for responding to activity in the quickest possible fashion.
80k messages per second on today's server infrastructure (or even my Core i3 laptop) borders on insignificant territory -- especially insofar as L1 cache performance is concerned. If the threads are doing a significant amount of work, then it's not unreasonable at all to expect the CPU to flush through the L1 cache every time a message is processed; and if the messages are not doing much work at all, then it just doesn't matter, because message handling is probably going to be less than 1% of the CPU load regardless of L1 policy.
At that rate of messaging I would recommend a passive threading model, i.e. one where threads are woken up to handle messages and then fall back asleep. That will give you the best latency-vs-performance tradeoff: it's not the most throughput-efficient method, but it is the best at responding quickly to network requests (which is usually what you want to favor when doing network programming).
On today's architectures (2.8 GHz, 4+ cores), I wouldn't even begin to worry about raw performance unless I expected to be handling maybe 1 million queued messages per second. And even then, it'd depend a bit on exactly how much Real Work the messages are expected to perform. If they aren't expected to do much more than prep and send some packets, then 1 mil is definitely conservative.
Is there a way, in the above design model, for the library thread and the process thread to share the same core, and consequently share a cache line?
No. I mean, sure there is if you want to roll your own operating system. But if you want to run in a multitasking environment with the expectation of sharing the CPU with other tasks, then "No." And locking threads to cores is something that is very likely to hurt your threads' average response times, without providing much in the way of better performance. (Any performance gain would depend on the system being used exclusively for your software, and would probably evaporate on a system running multiple tasks.)
Given a high frequency scenario, what is a light-weight way to share information between two threads?
Message queues. :)
Seriously. I don't mean to sound silly, but that's what message queues are. They share information between two threads and they're typically light-weight about it. If you want to reduce context switches, only signal the worker to drain the queue after some number of messages have accumulated (or after some timeout period, in case of low activity) -- but be wary that this will increase your program's response time/latency.