uv_write_t *req is not the data to be sent or received. It's just something like a handle to a write request.
Neither is req->data. That is a pointer to arbitrary private data for you. It might be used for example if you wanted to pass around some data related to the connection.
The actual payload data are sent through a write buffer (uv_buf_t) and received into a buffer that is allocated when a read request is served. That's why the read function wants an alloc parameter. Later that buffer is passed to the read callback.
The freeing of req->data assumes that 'data' pointed to some private data, typically a structure, that was malloc'd (by you).
As a rule of thumb, a socket is represented by a uv_xxx_t while reading and writing use 'request' structures.
Writing a server (a typical uv use case) one doesn't know how many connections there will be, hence everything is allocated dynamically.
To make your life easier you might think in terms of pairs (open/close or start/done). So when accepting a new connection you start a cycle and allocate the client. When closing that connection you free it. When writing, you allocate the request as well as the payload data buffer. When done writing, you free them. When reading you allocate a read request and the payload data buffer is allocated behind the scene (through the alloc callback) when done reading (and having copied the payload data) you free them both.
There are ways to get the job done without all those malloc/free pairs (which aren't glourious performance wise) but for a novice I would agree with the uv docs; you should definitely start with the malloc/free route.
To give you an idea: I pre-allocate everything for some ten or hundred thousand connections but that brings some administration and trickery with it, e.g. faking in the alloc call back to merely assign one of your pre-allocated buffers.
If asked to guess I'd suggest that avoiding malloc/free is only worth the trouble beyond well over 5k - 10k connections at any point in time.