The implementation is actually pretty straightforward. I've been trying to think of a way to implement such a thing in a completely lock-free manner for a couple days, and I haven't been able to come up with a solution. However, with the restriction of only a single writer, that makes things much easier.
The implementation is that of a circular doubly-linked list, but there are a couple tricks to making it work. First, readers only traverse the list in one direction: forward. That means that only the forward links and the head pointer need be kept valid at all times. In fact, I don't even declare the backward pointer in the nodes as volatile.
That brings us down to two variables that must be kept in sync: the forward links and the head pointer. However, with some careful instruction ordering, we find that we don't need to keep the head pointer coherent with the forward links 100% of the time (though the forward links must always be coherent).
The trick is to know where the linearization points are - the points where changes atomically take effect (atomic from the perspective of the readers). In the cases of both the insertion and deletion, this point occurs when we set the previous node's forward pointer to a new location (either the node we just allocated, in the former scenario, or the node after the one we're deleting). So long as this step is performed atomically, the change will be seamless to the readers. In the case of insertion, we have to set up the new node before doing this, of course. We have to, at the very minimum, set the new node's forward pointer to the node we're inserting before. Once that's done, we can link to the new node, and the chain will always be kept coherent in the forward direction.
That just leaves the matter of the head pointer. To save a bit of speed, new nodes are inserted right after the head node. That means we won't need to change the head pointer for insertions (unless the list was empty, in which case it's a trivial matter of setting the head pointer to the new node after we've made the node's links point to itself).
For deletions, however, there's always the chance that the head pointer will be pointing at the node we want to delete. But this is okay. It's okay for the head pointer to point to the node we're deleting, until we actually free the node's memory (and even then we use hazard pointers, in case some thread is still holding a pointer to it). In this case the linearization point just occurs slightly later: when the head pointer gets changed to point to the node after the deleted node. If a reader tries to read the node we’re deleting and then advances the head, it will just move the head to point to the node after the node we’re deleting, which we were going to do, anyway. At no time does a garbage pointer exist anywhere; thus it's fully thread-safe.
So, those are the solutions for the "hard parts". The rest is just vanilla hazard pointers, atomic compare/exchanges, and memory barriers.