First, as you probably noticed, mode 3 is mostly a bonus: start the mode 3 code only when everything else is finished.
We already saw in the first slides how the rx_rings of the e1000 NIC work. Normally you also know the main receive function: e1000_clean_rx_irq.
There is one struct e1000_rx_ring per hardware RX ring. Normally, there is only one ring (and you should only care about one for step 7).
I think the fields are pretty well defined (for once…):
    struct e1000_rx_ring {
        /* pointer to the descriptor ring memory */
        void *desc;
        /* physical address of the descriptor ring */
        dma_addr_t dma;
        /* length of descriptor ring in bytes */
        unsigned int size;
        /* number of descriptors in the ring */
        unsigned int count;
        /* next descriptor to associate a buffer with */
        unsigned int next_to_use;
        /* next descriptor to check for DD status bit */
        unsigned int next_to_clean;
        /* array of buffer information structs */
        struct e1000_rx_buffer *buffer_info;
        struct sk_buff *rx_skb_top;

        /* cpu for rx queue */
        int cpu;

        u16 rdh;
        u16 rdt;
    };
desc is a pointer to the memory zone containing the ring. The NIC itself accesses this zone through the dma address. The ring is a contiguous array of e1000_rx_desc structures. You usually access one of them with
E1000_RX_DESC(R, i)
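where R is the ring and i the index of the descriptor. In the e1000 sources the macro takes the ring itself rather than a pointer, so it is usually written E1000_RX_DESC(*rx_ring, i).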
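To make the indexing concrete, here is a small user-space sketch of what such a macro boils down to: cast the raw memory zone to an array of descriptors and take the address of element i. Everything prefixed with model_ or MODEL_ is made up for this illustration (it is not the driver's code), and the descriptor struct only mimics the 16-byte layout shown in the next listing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-in for struct e1000_rx_desc (same field sizes, 16 bytes). */
struct model_rx_desc {
    uint64_t buffer_addr;
    uint16_t length;
    uint16_t csum;
    uint8_t  status;
    uint8_t  errors;
    uint16_t special;
};

/* Only the ring fields needed for indexing (hypothetical, simplified). */
struct model_rx_ring {
    void *desc;          /* raw memory zone holding the descriptors */
    unsigned int count;  /* number of descriptors in the zone */
};

/* Hypothetical equivalent of E1000_RX_DESC: cast the zone, then index it. */
#define MODEL_RX_DESC(R, i) (&(((struct model_rx_desc *)((R).desc))[i]))

/* Allocate a zeroed ring zone; returns 0 on success, -1 on failure. */
static int model_ring_alloc(struct model_rx_ring *ring, unsigned int count)
{
    ring->desc = calloc(count, sizeof(struct model_rx_desc));
    ring->count = ring->desc ? count : 0;
    return ring->desc ? 0 : -1;
}

/* Bounds-checked accessor built on top of the macro. */
static struct model_rx_desc *model_get_desc(struct model_rx_ring *ring,
                                            unsigned int i)
{
    assert(i < ring->count);
    return MODEL_RX_DESC(*ring, i);
}
```

Descriptor i simply lives at byte offset i * 16 inside the zone, which is also how the NIC finds it starting from the dma address.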
One descriptor is composed of:
    struct e1000_rx_desc {
        __le64 buffer_addr; /* Address of the descriptor's data buffer */
        __le16 length;      /* Length of data DMAed into data buffer */
        __le16 csum;        /* Packet checksum */
        u8 status;          /* Descriptor status */
        u8 errors;          /* Descriptor Errors */
        __le16 special;
    };
The buffer_addr field holds the physical memory address where a packet can be received. Normally the kernel puts the address of an skbuff->data there. But once a packet has been received and its content copied into that buffer, how do we get back to the corresponding skbuff? This is why the e1000_rx_ring structure also has a buffer_info pointer. There are exactly as many buffer_info entries as e1000_rx_desc entries. The buffer info contains all the software-only information that we need to keep about each descriptor, such as the skbuff that will receive the data pointed to by buffer_addr.
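The pairing between the two arrays can be sketched like this in plain user-space C. The types and the functions model_attach and model_owner_of are invented for the illustration; the real e1000_rx_buffer has more fields, and the real driver converts the address with cpu_to_le64 before writing it into the descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define RING_COUNT 4

/* Hardware-visible side (reduced to the fields used here). */
struct model_rx_desc {
    uint64_t buffer_addr;
    uint16_t length;
    uint8_t  status;
};

/* Software-only side: what the driver remembers about descriptor i. */
struct model_rx_buffer {
    void    *owner; /* stands in for the skb (or a fastnet descriptor) */
    uint64_t dma;   /* bus address that was written into buffer_addr */
};

struct model_rx_ring {
    struct model_rx_desc   desc[RING_COUNT];
    struct model_rx_buffer buffer_info[RING_COUNT];
};

/* Give descriptor i a buffer and remember who owns that buffer. */
static void model_attach(struct model_rx_ring *ring, unsigned int i,
                         void *owner, uint64_t dma)
{
    assert(i < RING_COUNT);
    ring->buffer_info[i].owner = owner;
    ring->buffer_info[i].dma = dma;
    ring->desc[i].buffer_addr = dma;
}

/* After the NIC filled descriptor i, the same index gives back the owner. */
static void *model_owner_of(struct model_rx_ring *ring, unsigned int i)
{
    assert(i < RING_COUNT);
    assert(ring->buffer_info[i].dma == ring->desc[i].buffer_addr);
    return ring->buffer_info[i].owner;
}
```

The only link between a descriptor and its software information is the shared index i, so both arrays must always be updated together.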
Before any packet can be received, every buffer_addr has to be set! Otherwise the NIC would not know where to put the data. This is done in e1000_configure:
    for (i = 0; i < adapter->num_rx_queues; i++) {
        struct e1000_rx_ring *ring = &adapter->rx_ring[i];
        adapter->alloc_rx_buf(adapter, ring,
                              E1000_DESC_UNUSED(ring));
    }
alloc_rx_buf is a pointer to the function e1000_alloc_rx_buffers (if you don’t use jumbo frames, and you shouldn’t here). We see that this function is called for every RX ring.
The function e1000_alloc_rx_buffers is defined at line 4561. It calls “e1000_alloc_frag” for each of the cleaned_count descriptors starting at rx_ring->next_to_use (initialized to 0). In configure, the number of unused descriptors of a fresh ring is passed, which is essentially the whole ring, so this is equivalent to a full reset.
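Here is a rough user-space model of that bookkeeping. model_desc_unused follows the arithmetic of E1000_DESC_UNUSED as I remember it (check it against your kernel tree), and model_refill only advances the index where the real function would allocate and map a buffer. Both names and the reduced struct are assumptions for the illustration.

```c
#include <assert.h>

/* Only the indices matter here (hypothetical, reduced ring). */
struct model_ring {
    unsigned int count;
    unsigned int next_to_use;
    unsigned int next_to_clean;
};

/* Free descriptors; one slot always stays unused so that a full ring
 * can be told apart from an empty one. */
static unsigned int model_desc_unused(const struct model_ring *r)
{
    unsigned int base = (r->next_to_clean > r->next_to_use) ? 0 : r->count;

    return base + r->next_to_clean - r->next_to_use - 1;
}

/* Walk cleaned_count descriptors from next_to_use, wrapping at count. */
static unsigned int model_refill(struct model_ring *r,
                                 unsigned int cleaned_count)
{
    unsigned int i = r->next_to_use;
    unsigned int filled = 0;

    assert(cleaned_count <= model_desc_unused(r));
    while (cleaned_count--) {
        /* Real driver: allocate a buffer, DMA-map it, and write the
         * resulting address into buffer_addr of descriptor i. */
        if (++i == r->count)
            i = 0;
        filled++;
    }
    r->next_to_use = i;
    return filled;
}
```

With an 8-descriptor ring and both indices at 0, the unused count is 7, so the first call from configure fills every descriptor but one.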
The memory obtained from e1000_alloc_frag is (probably) not directly accessible by the hardware, so we have to DMA-map it:
    4613  buffer_info->dma = dma_map_single(&pdev->dev,
    4614                                     data,
    4615                                     adapter->rx_buffer_len,
    4616                                     DMA_FROM_DEVICE);
You will have to do the same with your fastnet buffers!
The resulting dma address is then stored in buffer_addr (line 4650).
When some packets are received, e1000_clean_rx_irq makes skbuffs out of them, and we need to allocate new buffers and put their addresses in the ring so that the next packets can be received. This is done at the end of e1000_clean_rx_irq:
    4481  cleaned_count = E1000_DESC_UNUSED(rx_ring);
    4482  if (cleaned_count)
    4483          adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
So when the device goes into mode 3 you have to:
- Replace the adapter->alloc_rx_buf function with your new one, which sets buffer_addr to DMA-mapped fastnet buffers.
- Do a full pass of the new alloc_rx_buf to replace all current skb buffers with fastnet buffers, so that all new packets are received directly in the fastnet zone.
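The two steps above can be sketched as a user-space toy. The adapter, the ring and all the model_ functions are invented for the example; the only thing borrowed from the driver is the idea of an alloc_rx_buf function pointer stored in the adapter. In the real driver you would presumably also have to unmap and release the old skb buffers while nothing is being received.

```c
#include <assert.h>
#include <stddef.h>

#define RING_COUNT 4

enum buf_kind { BUF_NONE, BUF_SKB, BUF_FASTNET };

struct model_ring {
    enum buf_kind kind[RING_COUNT]; /* what backs each descriptor */
    unsigned int next_to_use;
};

struct model_adapter {
    struct model_ring ring;
    /* Same role as adapter->alloc_rx_buf in the driver. */
    void (*alloc_rx_buf)(struct model_adapter *a, struct model_ring *r,
                         int cleaned_count);
};

/* Shared helper: give n descriptors a buffer of the given kind. */
static void model_fill(struct model_ring *r, int n, enum buf_kind kind)
{
    assert(n >= 0 && n <= RING_COUNT);
    while (n--) {
        r->kind[r->next_to_use] = kind;
        r->next_to_use = (r->next_to_use + 1) % RING_COUNT;
    }
}

/* Default allocator: skb-backed buffers. */
static void model_alloc_skb(struct model_adapter *a, struct model_ring *r,
                            int n)
{
    (void)a;
    model_fill(r, n, BUF_SKB);
}

/* Mode 3 allocator: buffers taken from the fastnet zone. */
static void model_alloc_fastnet(struct model_adapter *a,
                                struct model_ring *r, int n)
{
    (void)a;
    model_fill(r, n, BUF_FASTNET);
}

/* Step 1: swap the function pointer. Step 2: one full pass over the ring. */
static void model_enter_mode3(struct model_adapter *a)
{
    a->alloc_rx_buf = model_alloc_fastnet;
    a->ring.next_to_use = 0;
    a->alloc_rx_buf(a, &a->ring, RING_COUNT);
}
```

Because the refill at the end of clean_rx_irq goes through the same pointer, every later refill automatically uses fastnet buffers too.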
When a packet is received :
- In fastnet mode 3, do not call any skb-related function in clean_rx_irq: skip lines 4385 to 4401, which create the skb, as well as line 4424, line 4438, and lines 4453 to 4463.
- Instead of calling the kernel receive path at line 4464, copy the packet length into the fastnet buffer descriptor. Note that, as with the skb in buffer_info, you will need to keep a pointer to the related fastnet descriptor. There is no need to call the kernel receive path anymore: the whole purpose of mode 3 is to receive the packet directly in userspace. Just set the length so the user knows there is a packet available in the corresponding buffer!
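A stripped-down model of that receive loop could look like the following. struct fastnet_desc and its length field are assumptions (your fastnet descriptor will look different), and the loop leaves out everything the real clean_rx_irq does about errors, budgets and unmapping. The DD bit value 0x01 matches E1000_RXD_STAT_DD.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define RING_COUNT 4
#define MODEL_RXD_STAT_DD 0x01 /* "descriptor done" status bit */

struct model_rx_desc {
    uint16_t length;
    uint8_t  status;
};

/* Hypothetical fastnet descriptor: length 0 means "no packet yet". */
struct fastnet_desc {
    uint16_t length;
};

/* Software-only info: pointer kept the same way the skb is kept. */
struct model_rx_buffer {
    struct fastnet_desc *fn;
};

struct model_ring {
    struct model_rx_desc   desc[RING_COUNT];
    struct model_rx_buffer buffer_info[RING_COUNT];
    unsigned int next_to_clean;
};

/* Publish every completed packet to userspace; no skb, no kernel stack. */
static int model_clean_rx_mode3(struct model_ring *ring)
{
    unsigned int i = ring->next_to_clean;
    int cleaned = 0;

    while (cleaned < RING_COUNT &&
           (ring->desc[i].status & MODEL_RXD_STAT_DD)) {
        struct fastnet_desc *fn = ring->buffer_info[i].fn;

        assert(fn != NULL);
        fn->length = ring->desc[i].length; /* tells the user: packet here */
        ring->desc[i].status = 0;          /* descriptor can be reused */
        i = (i + 1) % RING_COUNT;
        cleaned++;
    }
    ring->next_to_clean = i;
    return cleaned;
}
```

The data itself is never copied: the NIC already wrote it into the fastnet buffer, so the length is the only thing left to hand over.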
What I mean by a “generic way” is that as little code as possible should be written inside e1000. So do not implement any fastnet descriptor related code in e1000: create a function in fastnet.c that returns the next fastnet buffer to map, and use net_dev_ops or adapter-> functions so that the fastnet ioctl can call a generic function like dev->ops->set_in_fastnet_zc_mode, which any driver may or may not implement.
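As a sketch of that optional hook, here is a user-space model. Only the name set_in_fastnet_zc_mode comes from the text above; its signature, the model_ structs and the choice of -EOPNOTSUPP for drivers without the hook are my assumptions, not an existing kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_net_device;

/* Stand-in for the ops table; the hook may be left NULL by a driver. */
struct model_dev_ops {
    int (*set_in_fastnet_zc_mode)(struct model_net_device *dev, int enable);
};

struct model_net_device {
    const struct model_dev_ops *ops;
    int zc_enabled;
};

/* Generic side (would live in fastnet.c): knows nothing about e1000. */
static int model_fastnet_set_zc(struct model_net_device *dev, int enable)
{
    assert(dev != NULL && dev->ops != NULL);
    if (!dev->ops->set_in_fastnet_zc_mode)
        return -EOPNOTSUPP; /* this driver has no zero-copy mode */
    return dev->ops->set_in_fastnet_zc_mode(dev, enable);
}

/* Driver side: the only fastnet-aware entry point in the driver. */
static int model_e1000_set_zc(struct model_net_device *dev, int enable)
{
    dev->zc_enabled = enable; /* real driver: swap alloc_rx_buf, refill */
    return 0;
}

static const struct model_dev_ops model_e1000_ops = {
    .set_in_fastnet_zc_mode = model_e1000_set_zc,
};

/* A driver that does not implement the hook. */
static const struct model_dev_ops model_other_ops = {
    .set_in_fastnet_zc_mode = NULL,
};
```

The fastnet ioctl then only ever calls the generic function, and a driver without the hook keeps working in the other modes.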