I believe that, rather than a kernel bug, this is entirely possible if we receive an
event on the UDP socket and then somehow fail to re-arm it. I tested this with a
small program:
<pre>#include &lt;arpa/inet.h&gt;
#include &lt;netinet/in.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/epoll.h&gt;
#include &lt;sys/socket.h&gt;

#define MAX_EVENTS 10 /* not shown in the original post; value assumed */

char buf[1024];

int main(void)
{
    struct epoll_event ev, events[MAX_EVENTS];
    int udp_sock, nfds, epollfd;
    struct sockaddr_in server_addr;

    if ((udp_sock = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
        perror("udp_sock: socket");
        exit(EXIT_FAILURE);
    }

    epollfd = epoll_create1(0);
    if (epollfd == -1) {
        perror("epoll_create1");
        exit(EXIT_FAILURE);
    }

    /* Bind the UDP socket to 127.0.0.1:12345. */
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(12345);
    server_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    if (bind(udp_sock, (struct sockaddr *)&server_addr,
             sizeof(server_addr)) < 0) {
        perror("bind");
        exit(EXIT_FAILURE);
    }

    /* Register the socket as one-shot: it is disarmed after the first event. */
    ev.events = EPOLLONESHOT | EPOLLIN;
    ev.data.fd = udp_sock;
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, udp_sock, &ev) == -1) {
        perror("epoll_ctl: udp_sock");
        exit(EXIT_FAILURE);
    }

    for (;;) {
        int n;

        nfds = epoll_wait(epollfd, events, MAX_EVENTS, -1);
        if (nfds == -1) {
            perror("epoll_wait");
            exit(EXIT_FAILURE);
        }

        for (n = 0; n < nfds; ++n) {
            /* Deliberately no re-arm (EPOLL_CTL_MOD) after the read. */
            if (events[n].data.fd == udp_sock)
                recv(udp_sock, buf, sizeof(buf), 0);
        }
    }

    return 0;
}
</pre>
So, here I receive an event on the UDP socket and then do not re-arm it for the next
iteration. After the first message is received, the EPOLLONESHOT flag remains but the
other event flags are cleared: the events mask for tfd 3 goes from 40000019 to
40000000. To wit,
<pre>[anirban@localhost test]$ cat /proc/6265/fdinfo/*
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 02
mnt_id: 7
pos: 0
flags: 02
mnt_id: 10
tfd: 3 events: 40000019 data: 40071000000003
[anirban@localhost test]$ echo "test" | nc -u 127.0.0.1 12345 # Send a message to the UDP socket
[anirban@localhost test]$
[anirban@localhost test]$ cat /proc/6265/fdinfo/*
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 0100002
mnt_id: 23
pos: 0
flags: 02
mnt_id: 7
pos: 0
flags: 02
mnt_id: 10
tfd: 3 events: 40000000 data: 40071000000003
[anirban@localhost test]$</pre>
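To make the two masks above easier to read, here is a minimal sketch that decodes the hex "events:" value from fdinfo into flag names. The helper name decode_epoll_events is hypothetical, and it only knows the handful of flags relevant here; it assumes the constants from the Linux sys/epoll.h header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical helper: write the names of the epoll flags set in `mask`
 * into `out` (capacity `len`), separated by '|', and return `out`.
 * Only a small subset of flags is recognised. */
const char *decode_epoll_events(uint32_t mask, char *out, size_t len)
{
    static const struct {
        uint32_t bit;
        const char *name;
    } flags[] = {
        { EPOLLIN,      "EPOLLIN" },
        { EPOLLOUT,     "EPOLLOUT" },
        { EPOLLERR,     "EPOLLERR" },
        { EPOLLHUP,     "EPOLLHUP" },
        { EPOLLONESHOT, "EPOLLONESHOT" },
    };

    assert(out != NULL && len > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        if (mask & flags[i].bit) {
            /* strncat always NUL-terminates; leave room for the terminator. */
            if (out[0] != '\0')
                strncat(out, "|", len - strlen(out) - 1);
            strncat(out, flags[i].name, len - strlen(out) - 1);
        }
    }
    return out;
}
```

With this, 0x40000019 decodes to EPOLLIN, EPOLLERR, EPOLLHUP and EPOLLONESHOT (the kernel adds EPOLLERR and EPOLLHUP on its own), while 0x40000000 decodes to EPOLLONESHOT alone, i.e. the descriptor is still registered but no longer reports anything.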
So, unless we want to unhook and re-hook the UDP socket after each event, we need to
figure out why the re-arm is not happening. As an aside, there seems to be a major
difference in where the re-arm happens between 2.5.3 and 2.7. In 2.5, it is called
explicitly in nfs_rpc_dispatcher_thread, while in 2.7 it is only called from within ntirpc.
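For reference, a re-arm of a one-shot descriptor is normally done with EPOLL_CTL_MOD. The following is only a sketch of that idea, not the code from ntirpc or the dispatcher thread: rearm_oneshot and count_delivered_events are hypothetical names, and an AF_UNIX datagram socketpair stands in for the UDP socket so that no port is needed.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: re-enable EPOLLIN on a descriptor that
 * EPOLLONESHOT has disarmed. The fd is still registered, so
 * EPOLL_CTL_MOD is used rather than EPOLL_CTL_ADD. */
int rearm_oneshot(int epollfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Send `rounds` one-byte datagrams and count how many epoll_wait reports,
 * re-arming after each event only when do_rearm is nonzero.
 * Returns the count, or -1 if setup fails. */
int count_delivered_events(int rounds, int do_rearm)
{
    int sv[2], epollfd, seen = 0;
    struct epoll_event ev, out;
    char byte;

    assert(rounds >= 0);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;
    epollfd = epoll_create1(0);
    if (epollfd == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = sv[0];
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sv[0], &ev) == -1) {
        close(epollfd);
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    for (int i = 0; i < rounds; i++) {
        if (write(sv[1], "x", 1) != 1)
            break;
        /* 100 ms timeout so a disarmed fd does not block forever. */
        if (epoll_wait(epollfd, &out, 1, 100) == 1) {
            seen++;
            if (read(sv[0], &byte, 1) != 1)
                break;
            if (do_rearm && rearm_oneshot(epollfd, sv[0]) == -1)
                break;
        }
    }

    close(epollfd);
    close(sv[0]);
    close(sv[1]);
    return seen;
}
```

Calling count_delivered_events(3, 0) mirrors the test program above: only the first datagram produces an event. With do_rearm set, every datagram does, which is the behaviour we would expect if the re-arm were actually reached.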
Thanks,
Anirban