UNIX Systems Administration, ARS Course

Transcription

UNIX Systems Administration, ARS Course, Part 3
Thierry Besançon
Formation Permanente de l'Université Pierre et Marie Curie
© T. Besançon (v ), Administration UNIX ARS Partie 3, 1 / 468

Chapter 1 Ethernet

Why this chapter: everything a system administrator ever wanted to know about Ethernet. It certainly overlaps a good deal with what is covered in the NETWORK (RESEAU) course.

1.1 Ethernet principle: CSMA/CD

Ethernet relies on Carrier Sense Multiple Access / Collision Detect (CSMA/CD). Two cases arise:
1. Transmission while the cable is free
2. Transmission when two stations send simultaneously, which causes a collision

Figure: transmission while the cable is free.
Figure: collision when two stations transmit simultaneously.

1.2 Ethernet 10 Base 5

Obsolete cabling.
Figure: three DTE stations attached to a 10 Base 5 cable.

Figure: an assembled unit.
Figure: a disassembled vampire tap.

Figure: the AUI cable, a vampire tap and its transceiver.
Figure: 10 Base 5 connectors on a mini-transceiver and a drop cable.

Figure: combo 10Base5 and 10BaseT card.
Figure: low-profile 10Base5 to RJ45 mini transceiver.

Figure: low-profile 10Base5 to RJ45 mini transceiver at the back of a machine.

1.3 Ethernet 10 Base 2

Obsolete cabling.
Figure: male BNC connector, 50 Ohm.

Figure: four DTE stations on a 10 Base 2 segment.

Figure: crimping a 10Base2 connector.

Figure: the crimped connector.

Figure: connection through a 10Base2 T-piece.

Figure: at the end of the cable, a 50 Ohm 10Base2 terminator.
Figure: combo 10Base2 and 10BaseT card.

1.4 Ethernet 10 Base T, 100 Base T

10 Base T: obsolete cabling.
Figure: stations connected to a hub.

Figure: connector to be crimped.
Figure: crossover cable, used to connect two computers, or two switches, directly to each other.

1.5 Format of an Ethernet address

An Ethernet address is 6 bytes written in hexadecimal as «xx:yy:zz:rr:ss:tt», where:
- the «xx:yy:zz» part identifies a manufacturer
- the «rr:ss:tt» part identifies a device from that manufacturer

Manufacturers are listed in the registry of OUIs (Organizationally Unique Identifiers).

There is an Ethernet broadcast address: «ff:ff:ff:ff:ff:ff». Every machine on the Ethernet segment is expected to listen to a frame sent to that address.
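The split between manufacturer part and device part can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `parse_mac` and `is_broadcast` are invented for the illustration, and the sample address is the one that appears in the «dmesg» output of section 1.6.

```python
def parse_mac(text):
    """Split a colon-separated Ethernet address into (OUI, device part)."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 colon-separated bytes: %r" % text)
    # int(p, 16) accepts "0" as well as "00"; bytes() rejects values > 0xff.
    octets = bytes(int(p, 16) for p in parts)
    return octets[:3], octets[3:]


def is_broadcast(text):
    """True if the address is the Ethernet broadcast address."""
    oui, device = parse_mac(text)
    return oui + device == b"\xff" * 6


def fmt(raw):
    """Render raw bytes back into the xx:yy:zz notation."""
    return ":".join("%02x" % b for b in raw)


oui, device = parse_mac("00:11:09:5a:f0:d8")
print("manufacturer part:", fmt(oui))
print("device part:      ", fmt(device))
print("broadcast?        ", is_broadcast("ff:ff:ff:ff:ff:ff"))
```

Looking the manufacturer part up in the OUI registry is left out on purpose, since it requires a copy of the registry file.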

1.6 Finding your Ethernet address on UNIX/LINUX

There are several ways to find your Ethernet address on UNIX/LINUX:
1. spot the devices in the «dmesg» output recorded at boot time
2. use the interface configuration command to display the Ethernet addresses

Method using «dmesg»:

    # dmesg
    ...
    Intel(R) PRO/1000 Network Driver - version k2-NAPI
    Copyright (c) Intel Corporation.
    e1000: 0000:02:01.0: e1000_probe: (PCI-X:66MHz:64-bit) 00:11:09:5a:f0:d8
    e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
    ...
    e1000: 0000:02:01.1: e1000_probe: (PCI-X:66MHz:64-bit) 00:11:09:5a:f0:d9
    e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
    ...

Do not wait too long after boot, otherwise these lines will have been pushed out of the «dmesg» buffer.

Figure: the «dmesg» buffer holds lines 1 to N; when line N+1 arrives, line 1 is lost.

Method using «ifconfig»:

    # ifconfig -a
    eth0  Link encap:ethernet HWaddr 00:11:09:5A:F0:D8
          inet addr: Bcast: Mask:
          inet6 addr: fe80::211:9ff:fe5a:f0d8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets: errors:0 dropped:0 overruns:0 frame:0
          TX packets: errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes: (2.6 GiB) TX bytes: (3.3 GiB)
          Base address:0x2000 Memory:d d
    eth1  Link encap:ethernet HWaddr 00:11:09:5A:F0:D9
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
          Base address:0x2040 Memory:d d

This is the most universal method.

1.7 Finding your Ethernet address on WINDOWS

Method 1: the «getmac.exe -v» command, available from MICROSOFT Windows XP onwards.

A more classic, and longer, method:
Figure: screenshots of the procedure.

Figure: screenshot of the procedure (continued).

While you are at it, rename the network interface that WINDOWS created during installation! The French interface name, «Connexion au réseau local», is hard to use in scripts because of its accented letter (cf. Franck Rupin's course). Use «ETH0», «ETH1», etc., for example (following the LINUX convention).

1.8 Format of an Ethernet frame

Ethernet frame == Ethernet packet.

The frame is preceded by 62 bits of alternating 0s and 1s, followed by 2 bits set to 1. It is then laid out as follows:

    Field                                                Size
    Destination address                                  6 bytes
    Source address                                       6 bytes
    Packet length (802.3 standard)
      or packet type (Ethernet standard)                 2 bytes
    Data                                                 46 to 1500 bytes
    Frame Check Sequence                                 4 bytes

Frame size: 18 bytes of header + at most 1500 bytes of data = 1518 bytes.

The 1500-byte capacity is called the MTU (Maximum Transmission Unit).

Jumbo frames also exist: frames carrying 9000 bytes of data. Network equipment often does not support them. Best avoided.

1.9 Address Resolution Protocol (ARP)

RFC 826

ARP answers the question «how do I talk to another IP machine on the same Ethernet segment without knowing its Ethernet address beforehand?».

In short, the protocol works as follows:
1. «I am the machine with IP address IP1 and Ethernet address MAC1. Everybody on the Ethernet segment, listen to me. I want to talk to the machine with IP address IP2. Let the machine with that IP address tell me its Ethernet address MAC2.»
2. «Heard you. I am the machine with IP2. Here is my Ethernet address MAC2.»
3. The other machines take the opportunity to note down the answer.
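The frame layout of section 1.8 can be decoded with Python's `struct` module. The sketch below is illustrative only: it assumes the frame is available as raw bytes starting at the destination address (as capture tools deliver it, without the preamble), the name `parse_header` is invented, and the sample frame is hand-built. The rule used to tell a length from a type (values up to 1500 are a length, values from 0x0600 upwards are a type) and the ARP type value 0x0806 come from the Ethernet standards, not from the slides.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def fmt(raw):
    """Render raw bytes in the xx:yy:zz notation."""
    return ":".join("%02x" % b for b in raw)


def parse_header(frame):
    """Decode the 14 bytes that precede the data field of a frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # 6 bytes destination, 6 bytes source, 2 bytes length or type,
    # all in network byte order.
    dst, src, value = struct.unpack("!6s6sH", frame[:14])
    kind = "length" if value <= 1500 else "type"
    return {"dst": fmt(dst), "src": fmt(src), kind: value}


# Hand-built sample: an ARP request is broadcast to the whole segment.
sample = (
    b"\xff" * 6                              # destination: broadcast
    + bytes.fromhex("0011095af0d8")          # source address
    + struct.pack("!H", ETHERTYPE_ARP)       # type field
    + b"\x00" * 46                           # data, padded to the minimum size
)
print(parse_header(sample))
```

The same function applied to a frame whose third field is, say, 100 would report a «length» instead of a «type», which is how the two standards coexist on one cable.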

Figure: ARP request sent to everyone, «WHO HAS IP ADDRESS a.b.c.d?»
Figure: ARP reply, «I HAVE IP ADDRESS a.b.c.d»

1.10 ARP table: arp command, /proc/net/arp

Learned addresses are kept in an in-memory cache: the ARP table.

To display the contents of the ARP table: «arp -a» or «arp -an».

Example:

    % arp -a
    Net to Media Table: IPv4
    Device  IP Address            Mask  Flags  Phys Addr
    eri0    solaris.example.org         SP     00:03:ba:0f:15:35

Another way to obtain the ARP table of a LINUX machine:

    % cat /proc/net/arp
    IP address  HW type  Flags  HW address         Mask  Device
                x1       0x2    00:02:7E:21:F7:9C  *     eth
                x1       0x2    00:48:54:6B:E5:B0  *     eth3
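Since «/proc/net/arp» is plain text, it is easy to process from a script. The following is a minimal sketch: `parse_arp_table` is a name invented here, and the IP addresses and the «eth0» device in `SAMPLE` are made-up placeholders (taken from the documentation range), because the real values are not legible in the listing above.

```python
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x2         00:02:7E:21:F7:9C     *        eth0
192.0.2.7        0x1         0x2         00:48:54:6B:E5:B0     *        eth3
"""


def parse_arp_table(text):
    """Return one dict per entry of a /proc/net/arp style listing."""
    entries = []
    for line in text.splitlines()[1:]:      # skip the header line
        fields = line.split()
        if len(fields) != 6:                # ignore blank or malformed lines
            continue
        ip, hw_type, flags, mac, mask, device = fields
        entries.append({"ip": ip, "mac": mac.lower(), "device": device})
    return entries


for entry in parse_arp_table(SAMPLE):
    print(entry["device"], entry["ip"], entry["mac"])
```

On a LINUX machine the same function can be fed the real table with `parse_arp_table(open("/proc/net/arp").read())`.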

1.11 ARP monitoring: the arpwatch command

ARPWATCH monitors ARP protocol exchanges and stores the addresses exchanged.

Caution: switches are now used almost everywhere, which makes it difficult to listen to packets that are not addressed to us.

Detection of a new machine:

    Date: Tue, 17 Feb :07: (CET)
    From: [email protected] (Arpwatch)
    To: [email protected]
    Subject: new station

    hostname: <unknown>
    ip address:
    ethernet address: 0:1e:68:be:93:32
    ethernet vendor: <unknown>
    timestamp: Tuesday, February 17, :07:

Change of Ethernet address on a machine:

    Date: Tue, 17 Feb :11: (CET)
    From: [email protected] (Arpwatch)
    To: [email protected]
    Subject: changed ethernet address (host dhcp.math.jussieu.fr)

    hostname: host dhcp.math.jussieu.fr
    ip address:
    ethernet address: 0:1e:68:be:93:32
    ethernet vendor: <unknown>
    old ethernet address: 0:3:93:42:72:de
    old ethernet vendor: <unknown>
    timestamp: Tuesday, February 17, :11:
    previous timestamp: Tuesday, February 17, :22:
    delta: 2 hours

Address flip flop:

    Date: Tue, 17 Feb :11: (CET)
    From: [email protected] (Arpwatch)
    To: [email protected]
    Subject: flip flop

    hostname: <unknown>
    ip address:
    ethernet address: 0:3:93:3:8e:b8
    ethernet vendor: <unknown>
    old ethernet address: 0:1f:5b:f6:29:90
    old ethernet vendor: <unknown>
    timestamp: Tuesday, February 17, :11:
    previous timestamp: Tuesday, February 17, :11:
    delta: 39 seconds

1.12 (Windows: flushing the ARP cache: netsh)

A WINDOWS machine uses an ARP cache like any other machine that speaks IP.

The ARP cache can be flushed with the command «netsh interface ip delete arpcache».

1.13 Reverse Address Resolution Protocol (RARP)

This protocol answers the question «I am the machine with Ethernet address MAC1; who can give me my IP address?».

Examples of such machines:
- UNIX workstations without a disk (so-called diskless stations)
- X terminals, thin clients
- network printers
- network webcams
- ethernet/usb or ethernet/parallel adapter boxes for printers

1.14 DHCP, dhcpd, dhcpd.conf, dhclient

The RARP question was: «I am the machine with Ethernet address MAC1; who can give me my IP address?».

The answer: Dynamic Host Configuration Protocol (DHCP), RFC 2131 and RFC 2132.
- UDP port 67: port the server listens on
- UDP port 68: client port, to which the server sends its replies

DHCP protocol:

    DHCP CLIENT                            DHCP SERVER
       DHCPDISCOVER (broadcast)   --->
                                  <---     DHCPOFFER (unicast)
       DHCPREQUEST (broadcast)    --->
                                  <---     DHCPACK (unicast)

Principle:
1. A machine starts in the «INIT» state.
2. It looks for a DHCP server by sending a «DHCPDISCOVER» packet.
3. One or more DHCP servers answer with a «DHCPOFFER» packet containing an IP address and DHCP options.
4. The client selects one DHCP server among those that answered, for example the first one.
5. The client broadcasts a «DHCPREQUEST» packet specifying the IP address it has chosen.
6. The selected server answers the client with a «DHCPACK» packet.
7. From then on, the client owns the IP address for a period of time called the lease.

When a machine that obtained its IP address via DHCP reboots, it does not go through exactly the same steps. It starts in the «INIT-REBOOT» state and sends a «DHCPREQUEST» packet asking again for the address it previously held. If the address is available, the server answers with «DHCPACK». If it is not (for example, the laptop has moved to another network), a DHCP server answers with «DHCPNAK». The machine then starts over from step 1 above.

Detection of duplicate IP addresses:
- the DHCP server sends an «ICMP Echo» packet; if it gets a reply, the server offers another address
- the DHCP client sends an «ARP» packet; if it gets a reply, the client sends a «DHCPDECLINE» packet and the server will then offer a new address

Example of a DHCP dialogue

Scenario: machine A runs in DHCP mode. Its network card is replaced and the machine keeps running in DHCP mode. On the DHCP server, the logs show:

    step 1 -> DHCPREQUEST for from 00:0c:29:44:44:0d via eri0: lease unavailable.
    step 2 -> DHCPNAK on to 00:0c:29:44:44:0d via eri0
    step 3 -> DHCPDISCOVER from 00:0c:29:44:44:0d via eri0
    step 4 -> DHCPOFFER on to 00:0c:29:44:44:0d via eri0
    step 5 -> DHCPREQUEST for ( ) from 00:0c:29:44:44:0d via eri0
    step 6 -> DHCPACK on to 00:0c:29:44:44:0d via eri0

Steps:
1. machine A asks for the IP address it used to obtain with its previous network card
2. the server refuses this address because it allocates addresses statically (one specific Ethernet address = one specific IP address)
3. the machine therefore politely asks the DHCP server for an IP address
4. the DHCP server offers an IP address
5. the machine accepts the offer and confirms the IP address to the DHCP server
6. the DHCP server records the allocation of the IP address
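The six steps of this scenario follow directly from the client states described earlier. The toy simulation below is only a sketch of that logic, with no network traffic at all: `dhcp_client` and `make_server` are hypothetical names, the «REQUESTING» and «BOUND» state names are borrowed from RFC 2131, and the server is reduced to a function that returns the name of its reply.

```python
def make_server(known_lease):
    """Toy server: answers by message name only.

    known_lease says whether the server recognises the address that a
    rebooting client asks for.
    """
    offered = []

    def reply(message):
        if message == "DHCPDISCOVER":
            offered.append(True)
            return "DHCPOFFER"
        if message == "DHCPREQUEST":
            # Accept a request for a known lease or for an address just offered.
            return "DHCPACK" if (known_lease or offered) else "DHCPNAK"
        raise ValueError("unexpected message: %s" % message)

    return reply


def dhcp_client(server_reply, rebooting=False):
    """Return the (sender, message) pairs exchanged until the lease is held."""
    log = []
    state = "INIT-REBOOT" if rebooting else "INIT"
    while state != "BOUND":
        # INIT discovers servers; INIT-REBOOT and REQUESTING ask for an address.
        sent = "DHCPDISCOVER" if state == "INIT" else "DHCPREQUEST"
        log.append(("client", sent))
        answer = server_reply(sent)
        log.append(("server", answer))
        if answer == "DHCPOFFER":
            state = "REQUESTING"
        elif answer == "DHCPACK":
            state = "BOUND"
        elif answer == "DHCPNAK":
            state = "INIT"              # start over from step 1
        else:
            raise ValueError("unexpected answer: %s" % answer)
    return log


# Machine A after its network card was replaced: the old lease is refused.
for sender, message in dhcp_client(make_server(known_lease=False), rebooting=True):
    print(sender, message)
```

With `known_lease=True` the rebooting client gets its «DHCPACK» straight away, which is the normal reboot case.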

ISC DHCP software

«ftp://ftp.isc.org/isc/dhcp/dhcp-3.0pl1.tar.gz»

Configuration file: usually «/etc/dhcpd.conf».

There is no «SIGHUP» to reconfigure it live. After changing the configuration file, the «dhcpd» daemon must be stopped and restarted.

Configuring a LINUX machine via DHCP

On Red Hat Linux, the «eth0» interface is put in DHCP mode through the file «/etc/sysconfig/network-scripts/ifcfg-eth0» (and likewise for the other interfaces, «eth1», etc.).

Dynamic configuration:

    DEVICE=eth0
    HWADDR=D0:67:E5:E7:F2:D8
    ONBOOT=yes
    BOOTPROTO=dhcp
    DHCPCLASS=

Static configuration:

    DEVICE=eth0
    HWADDR=D0:67:E5:E7:F2:D8
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=
    NETMASK=

At boot, the «dhclient» daemon is started to receive the configuration via DHCP (it can be configured through «/etc/dhclient.conf»).

DHCP relaying is possible

On CISCO routers, for example:

    interface FastEthernet0/0
     ip helper-address

Figure: a DHCP server and a DHCP client on either side of the relaying router.

1.15 Promiscuous mode

In theory a network card only listens to the packets addressed to it.

When an Ethernet card is in promiscuous mode, it can capture packets that are not addressed to it.

Uses:
- running packet capture software
- running software such as VMWARE, which has special network connection needs

Figure: a VMWARE server, with bridges between the physical interfaces (eth0, eth1, eth2, eth3), the virtual networks (vmnet0, vmnet2, vmnet3) and the interfaces of the virtual machine (eth0, eth1).

When a VMWARE virtual machine is started on LINUX (see the logs via «dmesg» or «syslog»):

    /dev/vmnet: open called by PID (vmware-vmx)
    device eth3 entered promiscuous mode
    bridge-eth3: enabled promiscuous mode
    /dev/vmnet: port on hub 3 successfully opened
    /dev/vmmon[31689]: host clock rate change request 0 -> 19
    /dev/vmmon[31689]: host clock rate change request 19 -> 83

When the VMWARE virtual machine is stopped (see the logs via «dmesg» or «syslog»):

    device eth3 left promiscuous mode
    bridge-eth3: disabled promiscuous mode

FREEBSD

The «ifconfig» command shows the promiscuous state when a program has switched the interface to promiscuous mode:

    # ifconfig -a
    ...
    bge1: flags=8943<up,broadcast,running,promisc,simplex,multicast> mtu 150
          options=1b<rxcsum,txcsum,vlan_mtu,vlan_hwtagging>
          inet netmask 0xffffffc0 broadcast
          inet netmask 0xffffffff broadcast
          inet netmask 0xffff0000 broadcast
          ether 00:19:bb:21:2f:01
          media: Ethernet autoselect (1000baseTX <full-duplex>)
          status: active
    ...

The switch to promiscuous mode is also reported by the kernel, which surfaces the information through «dmesg»:

    # dmesg
    ...
    bge1: promiscuous mode enabled
    ...
    bge1: promiscuous mode disabled
    ...

LINUX

The «ifconfig» command does not show the promiscuous state when a program has switched the interface to promiscuous mode:

    # ifconfig -a
    ...
    eth0  Link encap:ethernet HWaddr 00:11:09:5A:F0:D8
          inet addr: Bcast: Mask:
          inet6 addr: fe80::211:9ff:fe5a:f0d8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets: errors:0 dropped:0 overruns:0 frame:0
          TX packets: errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes: (2.6 GiB) TX bytes: (3.3 GiB)
          Base address:0x2000 Memory:d d

The switch to promiscuous mode is reported by the kernel, which surfaces the information through «dmesg»:

    # dmesg
    ...
    device eth0 entered promiscuous mode
    ...
    device eth0 left promiscuous mode
    ...

SOLARIS

The «ifconfig» command does not show the promiscuous state when a program has switched the interface to promiscuous mode.

1.16 Capturing Ethernet frames: the libpcap library

The «libpcap» library (library packet capture) is a C programming library specialised in capturing network packets.

It relies on a network driver present in the kernel, the BPF packet filter.

1.17 Capturing Ethernet frames: tcpdump

«TCPDUMP» is a tool for capturing Ethernet frames. It is the reference software for analysing the IP frames travelling on a network.

It is built on top of libpcap, which actually does all the work; tcpdump is just a wrapper around libpcap.

When a network problem occurs and its cause is not obvious, this is the tool to use.

Examples:

    # tcpdump -s 1500 host
    # tcpdump -s 1500 arp
    # tcpdump -s 1500 icmp
    # tcpdump -s 1500 dst sgbd.example.com port 5432
    # tcpdump -s w fichier
    # tcpdump -s r fichier

1.18 Capturing Ethernet frames: wireshark (ethereal)

Former name: «ethereal».

It is a graphical tool for analysing the IP frames travelling on a network.

It is used together with tcpdump:
1. «tcpdump» is asked to record the frames: «tcpdump -s w enregistrement»
2. «ethereal» is asked to read this capture file afterwards: «ethereal enregistrement»

Also available on WINDOWS.

Figure: wireshark screenshot.

1.19 Wake On Lan

On modern machines the network card stays powered. The network card can therefore start the motherboard when it receives special network packets, and boot the OS: this is Wake On Lan.

Figure: Wake On Lan illustration.

Wake On Lan evidently also depends on the quality of the network card driver. In the lab room of the Formation Permanente, with an ASUS P4P800X motherboard:
- Windows booted, then cleanly shut down: WOL possible
- Mandriva 2006 booted, then cleanly shut down: WOL impossible

How is a machine contacted? By sending it a special packet:
- most often a UDP packet addressed to the «discard» port (number 9)
- the content of the UDP packet (the so-called magic sequence) is 6 times «0xFF» followed by 16 times the Ethernet address

For example, to wake up the card with address «01:02:03:04:05:06», the sequence starts with FFFFFFFFFFFF and continues with 16 repetitions of 010203040506.

Software:
- «wakeonlan»
- «wakeonlan»: «java -jar wakeonlan.jar -i :02:03:04:05:06»
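The magic sequence is simple enough to build by hand. The sketch below follows the description above (6 bytes «0xFF», then 16 copies of the Ethernet address, sent over UDP to port 9). The function names are invented for the illustration, and the limited broadcast address used as default destination is an assumption: depending on the network, the subnet's own broadcast address may be needed instead.

```python
import socket


def magic_packet(mac):
    """Build the Wake On Lan payload for a colon-separated Ethernet address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected 6 bytes in Ethernet address: %r" % mac)
    return b"\xff" * 6 + octets * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast (port 9 is «discard»)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


packet = magic_packet("01:02:03:04:05:06")
print(len(packet), "bytes:", packet[:12].hex(), "...")
# To actually wake the machine: send_magic_packet("01:02:03:04:05:06")
```

Nothing is sent when the block is run as is; the call to `send_magic_packet` is left as a comment on purpose.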

Chapter 2 IP protocol

Why this chapter: everything a system administrator ever wanted to know about TCP/IP. It certainly overlaps a good deal with what is covered in the NETWORK (RESEAU) course.

2.1 IP v4 / IP v6

There are several versions of TCP/IP:
1. IP version 4: the historical version, the most widespread
2. IP version 6: the version of the future, not yet massively deployed but being rolled out (available on the FreeBOX, for example)

2.2 IP addresses

Some characteristics:
- protocol: IP version 4
- an IP address is 4 bytes long, written «a.b.c.d»
- a, b, c and d lie between 0 and 255 and are written in base 10 to avoid errors:

    % man 3 inet
    ...
    All numbers supplied as parts in a '.' notation may be decimal, octal,
    or hexadecimal, as specified in the C language (i.e., a leading 0x or 0X
    implies hexadecimal; otherwise, a leading 0 implies octal; otherwise,
    the number is interpreted as decimal).
    ...

- organisations allocate blocks of addresses to companies (for France: AFNIC)

2.3 Network addresses: classes A, B and C

The notion of address classes (class A, class B and class C) is obsolete:

    Class A, now called a «/8»:   leading bit 0,     netid 7 bits,   hostid 24 bits
    Class B, now called a «/16»:  leading bits 1 0,  netid 14 bits,  hostid 16 bits
    Class C, now called a «/24»:  leading bits 1 1 0, netid 21 bits, hostid 8 bits

2.4 Network addresses: subnets, CIDR notation

A subnet is a finer or coarser split than classes A, B or C.

The principle stays the same:
- N bits are imposed
- the remaining 32 - N bits are left variable

A network address is written «network-address/N» (the so-called CIDR notation, Classless Inter Domain Routing).

For example: « /25»

Meaning: 25 imposed bits, hence 32 - 25 = 7 variable bits.

Conclusion: the addresses available on this subnet run from « » ( ) to « » ( ).
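Python's standard `ipaddress` module performs this decomposition directly. The sketch below uses «192.0.2.0/25» purely as a stand-in network (it belongs to a range reserved for documentation); it is not the network of the example above, whose address is not legible. The helper name `describe` is invented.

```python
import ipaddress


def describe(cidr):
    """Return the fixed/variable bit counts and the address range of a subnet."""
    net = ipaddress.ip_network(cidr)
    return {
        "fixed_bits": net.prefixlen,
        "variable_bits": 32 - net.prefixlen,
        "first": str(net[0]),       # all variable bits at 0
        "last": str(net[-1]),       # all variable bits at 1
    }


info = describe("192.0.2.0/25")
for key, value in info.items():
    print(key, "=", value)
```

`ip_network` refuses an address whose variable bits are not all zero (for instance «192.0.2.5/25») unless `strict=False` is passed, which is a convenient check when typing a network address by hand.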

How many IP addresses are usable in a subnet?

Answer: with N variable bits, 2^N - 2, because:
- the address with all variable bits set to 0 is reserved to designate the subnet itself
- the address with all variable bits set to 1 is reserved as the broadcast address (see slide 92)

For example, with « /25»: 2^7 = 128 addresses in the subnet, and 128 - 2 = 126 addresses usable for machines.

    Mask (CIDR)   Number of addresses   Number of usable addresses
    /             =                     =
    /             =                     =
    /24           2^8 = 256             256 - 2 = 254
    /25           2^7 = 128             128 - 2 = 126
    /26           2^6 = 64              64 - 2 = 62
    /27           2^5 = 32              32 - 2 = 30
    /28           2^4 = 16              16 - 2 = 14
    /29           2^3 = 8               8 - 2 = 6
    /30           2^2 = 4               4 - 2 = 2
    /31           2^1 = 2               2 - 2 = 0
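The rule 2^N - 2 is easily checked with a short script. This is a sketch only: `usable_hosts` is a name invented for the illustration, and it applies the rule of the slide literally, without the special treatment that some systems give to /31 links.

```python
def usable_hosts(prefix_len):
    """Return (total addresses, usable addresses) for a /prefix_len subnet."""
    if not 0 <= prefix_len <= 31:
        raise ValueError("prefix length must be between 0 and 31")
    variable_bits = 32 - prefix_len
    total = 2 ** variable_bits
    # Two addresses are reserved: the subnet itself and the broadcast address.
    return total, total - 2


for prefix_len in range(24, 32):
    total, usable = usable_hosts(prefix_len)
    print("/%d  %4d addresses  %4d usable" % (prefix_len, total, usable))
```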

2.5 Network mask, netmask

The problem: how does station A build Ethernet packets to talk to machine B, wherever station B may be?

Two cases:
1. A and B are on the same local Ethernet network: A can send an Ethernet packet directly to B.
2. A and B are not on the same local Ethernet network: A must go through an intermediate router. The packet must then be built with the Ethernet address of the router as its destination Ethernet address, not the Ethernet address of B.

How can we tell whether A and B are on the same Ethernet network?

The solution: A and B are on the same physical network if IP(A) and IP(B) share one property, namely having the same network address.

The network address of A is computed by applying a bit mask to the IP address of A. This bit mask is called the network mask of A (or netmask).

    IP(A) & netmask(A) = network-address(A)

A and B are therefore on the same Ethernet network if:

    IP(A) & netmask(A) = IP(B) & netmask(A)

(where «&» denotes the logical AND)

Reminder on the logical AND:

    bit A   bit B   bit A AND bit B
      0       0           0
      0       1           0
      1       0           0
      1       1           1

The netmask is built as follows:
- the fixed part of the bits is set to 1
- the variable part of the bits is set to 0

Table: bit patterns and the corresponding mask values.

Example 1: network « /25», netmask « »

    A:               address in decimal, then in binary
    netmask(A):      25 bits set to 1, then 7 bits set to 0
    A & netmask(A):  network address of A
    B:               address in decimal, then in binary
    netmask(A):      25 bits set to 1, then 7 bits set to 0
    B & netmask(A):  same value as A & netmask(A)

So the machines are on the same network.

Example 2: network « /25», netmask « »

    A:               address in decimal, then in binary
    netmask(A):      25 bits set to 1, then 7 bits set to 0
    A & netmask(A):  network address of A
    B:               address in decimal, then in binary
    netmask(A):      25 bits set to 1, then 7 bits set to 0
    B & netmask(A):  a different value from A & netmask(A)

So the machines are not on the same network.
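Both examples boil down to one comparison, which can be written out with integer arithmetic. The sketch below is illustrative: the function names are invented, and the addresses are stand-ins taken from the documentation range 192.0.2.0/24, not the addresses of the two examples above.

```python
import ipaddress


def netmask_from_prefix(prefix_len):
    """Netmask as a 32-bit integer: fixed bits at 1, variable bits at 0."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF


def same_network(ip_a, ip_b, prefix_len):
    """True if IP(A) & netmask(A) equals IP(B) & netmask(A)."""
    mask = netmask_from_prefix(prefix_len)
    a = int(ipaddress.IPv4Address(ip_a))
    b = int(ipaddress.IPv4Address(ip_b))
    return (a & mask) == (b & mask)


mask = netmask_from_prefix(25)
print("netmask for /25:", ipaddress.IPv4Address(mask), format(mask, "032b"))
print(same_network("192.0.2.10", "192.0.2.100", 25))   # both below .128
print(same_network("192.0.2.10", "192.0.2.200", 25))   # on either side of .128
```

When the answer is false, station A hands the packet to its router instead of sending it directly to B, as explained at the start of section 2.5.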

2.6 IP broadcast address

Every IP machine listens to IP packets whose destination address is the broadcast address, and may answer depending on the type of packet.

The broadcast address of a machine A in the network «network/N» is built as follows:
- the N fixed bits remain unchanged
- the 32 - N bits (variable within the subnet) are set to 1

Example 1: network « /24»

    A:             address in decimal, then in binary (24 fixed bits, 8 variable bits)
    netmask(A):
    broadcast(A):  in binary, then in decimal

Example 2: network « /25»

    A:             address in decimal, then in binary (25 fixed bits, 7 variable bits)
    netmask(A):
    broadcast(A):  in binary, then in decimal

Example 3: network « /26»

    A:             address in decimal, then in binary (26 fixed bits, 6 variable bits)
    netmask(A):
    broadcast(A):  in binary, then in decimal
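The construction rule (keep the N fixed bits, set the other 32 - N bits to 1) is a single bitwise OR. A minimal sketch, again with made-up addresses from the documentation range 192.0.2.0/24 and an invented function name:

```python
import ipaddress


def broadcast_address(ip, prefix_len):
    """Broadcast address of the /prefix_len subnet that contains ip."""
    variable_bits = (1 << (32 - prefix_len)) - 1    # 32 - N low bits at 1
    value = int(ipaddress.IPv4Address(ip)) | variable_bits
    return str(ipaddress.IPv4Address(value))


for prefix_len in (24, 25, 26):
    print("/%d ->" % prefix_len, broadcast_address("192.0.2.10", prefix_len))
```

The standard library gives the same answer through `ipaddress.ip_network("192.0.2.0/26").broadcast_address`, which is a handy cross-check.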

2.7 Unicast, Broadcast, Multicast

Figure: unicast delivery.
Figure: broadcast delivery.

Figure: multicast delivery.

2.8 Special address: loopback address

The loopback is a virtual interface with IP address « ». The address « / 8» is called «localhost».

It allows a machine to make network connections to itself.

Figure: two applications, APPL1 and APPL2, on a machine connected to the INTERNET through IP address a.b.c.d, talking to each other through the loopback IP address.

The loopback address exists on UNIX/LINUX, on APPLE and on WINDOWS.

On LINUX, the loopback interface is called «lo».

    # ifconfig lo
    lo    Link encap:Local Loopback
          inet addr:  Mask:
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:8792 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8792 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes: (1.1 MiB) TX bytes: (1.1 MiB)

On SOLARIS, the loopback interface is called «lo0».

On WINDOWS, the loopback address does not appear in the list of interfaces, but it does exist:

2.9 Special addresses: private addresses / RFC 1918

RFC 1918: private addresses:
« /8»
« /12»
« /16»

These addresses can be used on networks that are not connected to the Internet, or that are connected through NAT (Network Address Translation; for example: ADSL boxes).

2.10 Special addresses: other addresses / RFC 3330

RFC 3330:

    Network    Description                          Documentation
    « /8»      "This" Network                       RFC1700, page 4
    « /8»      Private-Use Networks                 RFC1918
    « /8»      Public-Data Networks                 RFC1700, page 181
    « /8»      Cable Television Networks
    « /8»      Reserved but subject to allocation   RFC1797
    « /8»      Loopback                             RFC1700, page 5
    « /16»     Reserved but subject to allocation
    « /16»     Link Local
    « /12»     Private-Use Networks                 RFC1918
    « /16»     Reserved but subject to allocation

RFC 3330 (continued):

    Network    Description                          Documentation
    « /24»     Reserved but subject to allocation
    « /24»     Test-Net
    « /24»     6to4 Relay Anycast                   RFC3068
    « /16»     Private-Use Networks                 RFC1918
    « /15»     Network Interconnect Device          RFC2544
               Benchmark Testing
    « /24»     Reserved but subject to allocation
    « /4»      Multicast                            RFC3171
    « /4»      Reserved for Future Use              RFC1700, page 4

2.11 Packet encapsulation

The principle is that of Russian dolls.

Figure: Ethernet frame = Ethernet header + data + Ethernet trailer

Figure: Ethernet frame = Ethernet header + ARP frame + Ethernet trailer

Figure: Ethernet frame = Ethernet header + IP frame + Ethernet trailer

Figure: Ethernet frame = Ethernet header + IP frame + Ethernet trailer, where the IP frame = IP header + UDP frame

Figure: Ethernet frame = Ethernet header + IP frame + Ethernet trailer, where the IP frame = IP header + TCP frame

Figure: Ethernet frame = Ethernet header + IP frame + Ethernet trailer, where the IP frame = IP header + ICMP frame

2.12 Configuring an IP address: ifconfig

(«ifconfig» stands for interface configuration)

The «ifconfig» command sets the parameters of the network cards:
IP address
IP netmask
IP broadcast address
interface ON / OFF
virtual addresses, etc.

WARNING: the «ifconfig» command works with interface names. You have to find these names first!

Manual configuration on LINUX

    # ifconfig eth0 inet
    # ifconfig eth0 netmask 0xffff0000
    # ifconfig eth0 netmask
    # ifconfig eth0 broadcast
    # ifconfig eth0 inet /16        <-- more compact notation

    # ifconfig -a
    eth0  Link encap:Ethernet HWaddr 00:22:19:68:BB:E5
          inet addr:  Bcast:  Mask:
          inet6 addr: fe80::222:19ff:fe68:bbe5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets: errors:0 dropped:0 overruns:0 frame:0
          TX packets: errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes: (1.1 GiB) TX bytes: (436.8 MiB)
          Interrupt:90 Memory:d d
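The commands above write the same netmask in three ways: hexadecimal («0xffff0000»), dotted decimal, and prefix length («/16»). A small sketch shows the correspondence; `netmask_forms` is a hypothetical helper, not part of any tool mentioned in the course.

```python
import ipaddress

def netmask_forms(prefix_len: int) -> tuple[str, str]:
    """Return the (dotted decimal, hexadecimal) forms of a prefix length."""
    netmask = ipaddress.IPv4Network(f"0.0.0.0/{prefix_len}").netmask
    # Dotted decimal form, then the 32-bit value in hexadecimal
    return str(netmask), f"0x{int(netmask):08x}"

for n in (8, 16, 25):
    print(f"/{n}", *netmask_forms(n))
```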

Naming policy for network cards on LINUX

«lo»: loopback interface
physical interface 1: «eth0»
physical interface 2: «eth1»
physical interface 3: «eth2»
etc.

Reminder: the «dmesg» command reports the list of interfaces:

    # dmesg
    ...
    eth0: Digital DS21143 Tulip rev 65 at 0xf8a96000, 00:80:C8:C9:83:F9, IRQ

Naming policy for network cards on SOLARIS

The cards are named after their chips, followed by a number:
«lo0»: loopback interface
«le0»: Lance Ethernet chip
«qfe0»: Quad Fast Ethernet chip
«eri0»: 10/100Mbit RIO Gem Chip
«e1000g»: Intel Pro/1000 chip
«hme0»: chip ??
etc.

On SOLARIS, you can of course read the manual pages for these cards: «man hme», «man qfe», «man e1000g», «man ce», «man eri», «man bge», etc.

Configuration at boot time on LINUX

On a Linux machine, the network settings of the cards are stored in the following files:
«eth0»: «/etc/sysconfig/network-scripts/ifcfg-eth0»
«eth1»: «/etc/sysconfig/network-scripts/ifcfg-eth1»
etc.

For example:

    # cat /etc/sysconfig/network-scripts/ifcfg-eth0
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=
    BROADCAST=
    NETMASK=

2.13 (Windows 98 :: winipcfg.exe)

The «winipcfg.exe» command displays the network configuration of the interfaces in graphical mode.

2.14 (Windows :: ipconfig.exe)

The «ipconfig.exe» command displays the network configuration from the command line:
interfaces
DHCP
etc.

2.15 (Windows :: netsh.exe)

The «netsh» command configures many networking aspects from the command line:
interface configuration
DHCP configuration
Microsoft firewall configuration
WIFI configuration
IPSEC configuration
etc.

2.16 How to find out your IP address, for dummies

Connect to the web site « ». Many other web sites offer the same service.

Caution: for this to work, your web traffic must not go through a web proxy.

2.17 Principle of virtual IP addresses

In addition to physical network interfaces, a machine can have virtual network interfaces.

Use
A network service is tied to a virtual interface, which gets its own IP number on a server that already has a physical interface. In practice the server therefore has several IP addresses.
If the service has to move to another server, the virtual interface is moved to that server, and the original server carries on with its other roles.

(Diagram: one server with its physical interface eth0 and virtual interfaces such as eth0:1, each virtual interface carrying one network service, A or B.)

(Diagram: server 1 and server 2, each with its own eth0; the virtual interfaces and their network services A and B can be placed on either server.)

2.18 Configuring virtual IP addresses on LINUX: ifconfig

On LINUX, a suffix «:0», «:1», «:2», etc. is appended to the name of the physical interface that should receive a virtual IP address.
For example: «eth0:0», «eth0:1», «eth0:2»

Any index can be used for a virtual address. For example, «eth0:11» can be used even if the indexes 0 to 10 are not in use.

Manual configuration of a virtual interface on LINUX

    # ifconfig eth3:1 inet /26 up
    # ifconfig -a
    ...
    eth3    Link encap:Ethernet HWaddr 00:22:19:68:BB:EB
            inet6 addr: fe80::222:19ff:fe68:bbeb/64 Scope:Link
            UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
            RX packets: errors:0 dropped:0 overruns:0 frame:0
            TX packets: errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:1000
            RX bytes: (241.9 MiB) TX bytes: (440.2 MiB)
            Interrupt:114 Memory:dc dc
    eth3:1  Link encap:Ethernet HWaddr 00:22:19:68:BB:EB
            inet addr:  Bcast:  Mask:
            UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
            Interrupt:114 Memory:dc dc

The virtual interface is removed with: «ifconfig eth3:1 down».

Configuration at boot time on LINUX

On a Linux machine, the network settings of the virtual cards are stored in the following files:
«eth0:0»: «/etc/sysconfig/network-scripts/ifcfg-eth0:0»
«eth0:1»: «/etc/sysconfig/network-scripts/ifcfg-eth0:1»
etc.

For example:

    # cat /etc/sysconfig/network-scripts/ifcfg-eth0:1
    DEVICE=eth0:1
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=
    BROADCAST=
    NETMASK=

2.19 IP connections / IP ports

An IP connection is made up of the following elements:
a source IP address
a source port number on the originating machine
a destination IP address
a port number on the destination machine
the protocol, TCP or UDP

(Diagram: port 1 on IP1, linked by UDP or TCP to port 2 on IP2.)

In practice there are 3 categories of ports:
the Well Known Ports, from 0 to 1023
the Registered Ports, from 1024 to
the Dynamic and/or Private Ports, from  to

Caution:
On UNIX, the C function that obtains a source port < 1024 only succeeds for UID 0, hence the use of SetUID 0.
On Windows, the C function that obtains a source port < 1024 succeeds whatever the UID.

2.20 The /etc/services file

The «/etc/services» file lists triplets (port number, protocol, service name).

A triplet can be obtained officially for one's own program from the IANA « ».

WARNING: the «/etc/services» file does not indicate which network services are enabled on the machine.

The «/etc/services» file is used to convert a numeric port into a more meaningful symbolic name. See the C functions «getservbyname()», «getservbyport()», «getservent()».
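To make the triplet idea concrete, here is a minimal sketch that parses text in the «/etc/services» layout into a lookup table. The function name `parse_services` is hypothetical; the sample lines reuse entries that appear in the extract shown in this course, and aliases are ignored for brevity.

```python
def parse_services(text: str) -> dict[tuple[int, str], str]:
    """Map (port, protocol) to the service name, /etc/services style."""
    table = {}
    for line in text.splitlines():
        # Everything after '#' is a comment
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]  # fields[2:] are aliases
        port, proto = port_proto.split("/")
        table[(int(port), proto)] = name
    return table

SAMPLE = """
ftp      21/tcp           #File Transfer [Control]
ssh      22/tcp           #Secure Shell Login
smtp     25/tcp   mail    #Simple Mail Transfer
"""

services = parse_services(SAMPLE)
print(services[(22, "tcp")])
```

On a real system the same lookup is available through `socket.getservbyport(22, "tcp")`, which relies on the C functions listed above; its result depends on the local services database.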

Extract from an «/etc/services» file:

    ...
    chargen    19/tcp   ttytst source   #Character Generator
    chargen    19/udp   ttytst source   #Character Generator
    ftp-data   20/tcp                   #File Transfer [Default Data]
    ftp-data   20/udp                   #File Transfer [Default Data]
    ftp        21/tcp                   #File Transfer [Control]
    ftp        21/udp                   #File Transfer [Control]
    ssh        22/tcp                   #Secure Shell Login
    ssh        22/udp                   #Secure Shell Login
    telnet     23/tcp
    telnet     23/udp
    smtp       25/tcp   mail            #Simple Mail Transfer
    smtp       25/udp   mail            #Simple Mail Transfer
    ...

2.21 List of active network ports: netstat -a, netstat -an

The «netstat -a» command lists the network connections that are established or waiting to be established.

The port names displayed come from «/etc/services» (through the C functions «getservbyname()», «getservbyport()», «getservent()»).

    % netstat -a
    Active Internet connections (including servers)
    Proto Recv-Q Send-Q Local Address Foreign Address (state)
    tcp4  0 0 *.ssh     *.*  LISTEN
    tcp         *.ssh     *.*  LISTEN
    udp4  0 0 *.syslog  *.*
    udp6  0 0 *.syslog  *.*
    udp4  0 0 *.bootpc  *.*
    Active UNIX domain sockets
    ...

There is also the «-n» option.

The «netstat -a -n» command lists the network connections that are established or waiting to be established, without translating port numbers into the symbolic names found in «/etc/services». Port numbers are displayed in numeric form:

    % netstat -an
    Active Internet connections (including servers)
    Proto Recv-Q Send-Q Local Address Foreign Address (state)
    tcp4  0 0 *.22   *.*  LISTEN
    tcp         *.22   *.*  LISTEN
    udp4  0 0 *.514  *.*
    udp6  0 0 *.514  *.*
    udp4  0 0 *.68   *.*
    Active UNIX domain sockets
    ...

Prefer this form: it is more meaningful from a network point of view.

Pros and cons of the «netstat» command

It gives the list of open ports, but not the programs listening on those ports.

The «netstat» command is handy when you know what you expect to see (for example, to check that a program is running correctly and is listening on the intended network port).

When «netstat» reports a listening port, it does not say which program is listening on it. Another utility is then needed to find out who is listening on that port.

2.22 (Windows :: List of active network ports: netstat.exe)

A «netstat.exe» command exists on WINDOWS. Same principle and same behaviour as on UNIX.

2.23 List of network connections: lsof

The problem: how to identify which program is listening on a port that «netstat -an» reports as open?

Solution: «lsof» (for "list open files")
Site: ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/

«lsof» shows the file descriptors (local or network) that are open on a UNIX machine.

Local file descriptor: for example, to see which processes use the «/var/run» partition:

    % lsof /var/run
    COMMAND PID USER   FD TYPE DEVICE SIZE/OFF NODE NAME
    lpd     410 daemon 6u VREG 0,               /var/run (swap)
    dhcpd       root   6w VREG 0,               /var/run (swap)

Network file descriptor: for example, to see who is using a given TCP connection:

    % lsof -i tcp:32771
    COMMAND PID USER FD  TYPE DEVICE      SIZE/OFF NODE NAME
    inetd   320 root 18u IPv4 0x f350     0t0      TCP  *:32771 (LISTEN)

Network file descriptor: for example, to see who is using a given UDP connection:

    % lsof -i UDP@ :3853
    COMMAND PID USER FD TYPE DEVICE        SIZE/OFF NODE NAME
    ss_logd 232 root 3u IPv4 0x30001d961c0 0t0      UDP  localhost:3853 (Idle)

(format: «[protocol][@hostname hostaddr][:service port]»)

2.24 (Windows :: List of network connections: tcpview.exe)

TCPVIEW software, available at « »

TCPVIEW has a command-line version: TCPVCON

2.25 Connection command: telnet

Syntax: telnet host

Examples of use:
connection to a UNIX machine
connection to a printer
connection to a piece of network equipment
etc.

Whenever possible, prefer a remote shell connection over SSH, because «telnet» sends the password in clear text over the network.

Another important syntax: telnet host port

Examples of use:
manual connection to a POP server
manual connection to an IMAP server
manual connection to an HTTP server
etc.

SSH cannot replace this syntax, which is why «telnet» is still useful today.

Example: connection to a UNIX machine

    % telnet server.example.com
    Trying
    Connected to server.example.com.
    Escape character is ^].

    SunOS 5.7

    login: besancon
    Password: XXXXXXXX
    Last login: Sun Oct 12 15:18:22 from ppp-3
    Sun Microsystems Inc. SunOS 5.5 Generic November 1995
    server%

This kind of connection is completely obsolete. Use the SSH command instead.

Example: connection to a printer

    % telnet hp4100.example.com
    Trying
    Connected to hp4100.example.com.
    Escape character is ^].
    HP JetDirect
    Password: XXXXXXXX
    You are logged in
    Please type "?" for HELP, or "/" for current settings
    > /
    ===JetDirect Telnet Configuration===
    Firmware Rev.     : G
    MAC Address       : 00:30:c1:0a:45:b2
    Config By         : USER SPECIFIED
    IP Address        :
    Subnet Mask       :
    Default Gateway   :
    Syslog Server     : Not Specified
    Idle Timeout      : 90 Seconds
    Set Cmnty Name    : Not Specified
    Host Name         : Not Specified
    Default Get Cmnty : Enabled
    DHCP Config       : Disabled
    Passwd            : Enabled
    IPX/SPX           : Disabled
    DLC/LLC           : Disabled
    Ethertalk         : Enabled
    Banner page       : Disabled
    > exit
    EXITING WITHOUT SAVING ANY ENTRIES
    > Connection closed by foreign host.

Interrupting a connection with Ctrl-] (Control + closing square bracket)

    % telnet obsolete.example.com
    Trying
    Connected to obsolete.example.com.
    Escape character is ^].
    telnet
    login: besancon
    Password: XXXXXXXX
    Login incorrect
    login: ^]                <-- type Ctrl-]
    telnet> quit
    Connection closed.

Attempt to connect to a machine that does not run telnet

    % telnet notelnet.example.com
    Trying
    telnet: connect to address : Connection refused
    telnet: Unable to connect to remote host

Humour

Try «telnet towel.blinkenlights.nl»

2.26 (Windows :: Connection command: telnet.exe)

The «telnet.exe» program is the equivalent of the UNIX «telnet».

2.27 Duplicate IP address

A recurring problem, the duplicate IP address: two machines on the same Ethernet network use the same IP address at the same time.

Nothing can prevent a user from taking an address that is already in use (short of not giving them administrator rights on their workstation).

The worst scenario: the usurper takes the address of the default router or of a strategic server.

Solution: trace the Ethernet addresses in the ARP tables of the switches.

Symptom of a duplicate IP address on Apple MacOS X:

2.28 IP v6

Figure: total address space allocated (addresses allocated, by year; vertical axis up to 1.6e+09)

Figure: total address space allocated (addresses allocated, by year; vertical axis up to 5e+09)

Projected exhaustion of IP numbers, hence the creation of IP v6.
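The scale of the problem follows from the address widths: IPv4 addresses are 32 bits long and IPv6 addresses are 128 bits (16 bytes) long. A few lines of Python give the orders of magnitude; the variable names are chosen for illustration only.

```python
# Size of each address space, from the address width in bits
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128

print(f"IPv4: {ipv4_space:,} addresses")
print(f"IPv6: {ipv6_space:,} addresses")

# Number of IPv6 addresses available for every single IPv4 address
ratio = ipv6_space // ipv4_space
print(f"ratio: 2**{ratio.bit_length() - 1}")
```

The IPv4 total, a little under 4.3e+09, is the ceiling that the allocation curves above are approaching.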

    From: James Carlson
    To:
    Subject: Re: (IPng) GENERAL IPNG ISSUES
    Date: Mon, 26 Sep 94 07:29:

    >> PS why do people say that 16 bytes is enough to address the people
    >> on the entire planet squillions of times over when addresses relate
    >> to location geography?

    16 bytes is 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456.
    This is a humorously large address space. Taking a SWAG at the size of
    the Earth, about 201,062,400 square miles, this comes to around
    1,692,421,690,584,308,470,720,406,239,216 addresses per square mile of
    Earth's surface, or about 421,578,297,421,497,485,189 addresses per
    square inch. Even if we chop off three bytes to indicate galaxy, solar
    system and planet, we'd still have 25,128,024 addresses per square
    *mil* here on Earth. Pathology will never be the same after every
    microbe has its own address...

2.29 Annex 1

Attached to the printed version of this course: a TCP/IP state diagram.


2.30 Annex 2

Attached to the printed version of this course: an article titled «Introduction à NETFLOW», from issue 47 (January/February 2010) of MISC.
Web address: « »


Chapter 3: Default IP routing

Why this chapter
Everything a system administrator ever wanted to know about routing? Well, no! Too complicated.
There is certainly a lot of overlap with what will have been covered in the NETWORK course.

3.1 The notion of routing

Routing: forwarding IP packets to their destination along a path determined by the destination.

The simplest case, default routing: all IP packets are sent to one router, which forwards them further.

The other cases: specialised protocols such as ROUTED, BGP, OSPF, etc. (see the network course)

3.2 Relationship between IP routing and Ethernet packets

(Diagram, from top to bottom:)

    Machine A:  ETH = 11:22:33:44:55:66   IP =
    Frame A -> router:
        ETH src = 11:22:33:44:55:66   ETH dest = AA:BB:CC:11:11:11
        IP src =    IP dest =    IP data
    ROUTER:     ETH = AA:BB:CC:11:11:11   IP =
                ETH = AA:BB:CC:22:22:22   IP =
    Frame router -> B:
        ETH src = AA:BB:CC:22:22:22   ETH dest = 77:88:99:00:AA:BB
        IP src =    IP dest =
    Machine B:  ETH = 77:88:99:00:AA:BB   IP =

The Ethernet source and destination addresses change at each hop, while the IP source and destination addresses stay the same from A to B.

3.3 Default routing

Default routing: all IP packets are sent to one router, which forwards them further.

The destination of the default route is the network « ».

Using the «netstat» command (see page 165) with the «-r» and «-n» options, this special network shows up clearly:

    % netstat -r
    Kernel IP routing table
    Destination  Gateway  Genmask  Flags  MSS Window irtt
                 *                 U
    default                        UG

    % netstat -rn
    Kernel IP routing table
    Destination  Gateway  Genmask  Flags  MSS Window irtt
                                   U
                                   UG
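The role of the default route in such a table can be sketched as a lookup: the most specific matching network wins, and the default route matches when nothing else does. The sketch below assumes the conventional notation 0.0.0.0/0 for the default network; the function `next_hop`, the route list and all addresses (documentation ranges) are hypothetical.

```python
import ipaddress

def next_hop(routes, dest: str):
    """Return the gateway of the most specific route matching dest."""
    addr = ipaddress.IPv4Address(dest)
    best = None
    for network, gateway in routes:
        net = ipaddress.IPv4Network(network)
        # Longest prefix wins; 0.0.0.0/0 matches everything with length 0
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return best[1] if best else None

# Hypothetical table: one directly connected network, one default route
ROUTES = [
    ("192.0.2.0/24", "direct"),      # flag U in netstat output
    ("0.0.0.0/0", "192.0.2.254"),    # flags UG: the default router
]

print(next_hop(ROUTES, "192.0.2.17"))     # local network, no router needed
print(next_hop(ROUTES, "198.51.100.5"))   # anything else goes to the router
```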

3.4 Manual routing configuration on LINUX: route

The «route» command is used to configure routing.

    # route add default gw

(gw = gateway = router)

3.5 Automatic routing configuration on LINUX: route

On a LINUX machine, see the «/etc/sysconfig/network» file:

    NETWORKING=yes
    FORWARD_IPV4=false
    HOSTNAME=server.example.com
    GATEWAY=

3.6 Routing configuration on SOLARIS: route

The «route» command is used to configure routing.

    # route add default

On a SOLARIS machine, see the «/etc/defaultrouter» file:

3.7 (Windows :: route.exe)

On WINDOWS, the command is «route.exe».

3.8 Routing: netstat

The «netstat -r» command prints the routing table of a UNIX machine.

Prefer «netstat -rn», which prints the routing table in numeric form (it is more meaningful):

    % netstat -rn
    Routing Table:
    Destination  Gateway  Flags  Ref  Use  Interface
                          UGH
                          U                le
                          U      3    0    le0
    default               UG
                          UH               lo0

3.9 (Windows :: netstat.exe)

On WINDOWS, the command is «netstat.exe». Same principle as on UNIX: «netstat.exe -rn»

3.10 Connectivity test: ping

To check that routing works:
the «ping» command
FPING: « »
HPING: « »
the «traceroute» command

These commands are also available on WINDOWS.

Chapter 4: Domain Name System (DNS)

Why this chapter
Everything a system administrator ever wanted to know about the DNS.
There is certainly a lot of overlap with what will have been covered in the NETWORK course.

4.1 Principles of the DNS

It is impossible in practice to keep the «/etc/hosts» files of all the machines of a whole company network up to date.
They are replaced by a distributed directory in which each party manages its own entries: the DNS.

Characteristics of the DNS database:
distributed
small, with data that changes infrequently
hierarchical
read-only access; there are no modification requests

4.2 DNS zone

DNS zone: reflects the distributed and hierarchical nature of the DNS
a contiguous part of the tree
a parent zone delegates a child zone to one or more servers (nameservers) that hold the information about the child zone.

(Diagram: name tree with «com», «net», «fr», ... at the top level, then «wanadoo», «jussieu», then «formation», ..., and leaf names such as «www» and «ssh».)

4.3 DNS query

The DNS is built on a client-server model.

See the DNSTRACER software at « »

The DNS relies on root nameservers.

To provide a reliable service, a zone is served by one primary nameserver and several secondary (backup) nameservers, which synchronise with each other.

Information gathered from past queries is cached in order to answer later queries faster. Cached information has an expiry time (TTL = Time To Live). A server that caches a DNS record has no authority over it.

Each record in the database has:
a class; the most common is IN (Internet)
a type: A, PTR, NS, SOA, MX, CNAME, ...
and therefore has the form: (class, type, key, value, TTL)

A query then looks like:
(class, type, key, ?, ?)
(class, *, key, ?, ?)
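The caching principle and the record form (class, type, key, value, TTL) can be illustrated with a toy cache. This is a sketch under stated assumptions, not how «named» is implemented: the class `DnsCache`, its methods and the sample record are all hypothetical.

```python
import time

class DnsCache:
    """Toy cache of (class, type, key) -> value, with TTL-based expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable clock, handy for testing
        self._store = {}

    def put(self, rclass, rtype, key, value, ttl):
        # Remember the value together with its expiry time
        self._store[(rclass, rtype, key)] = (value, self._clock() + ttl)

    def get(self, rclass, rtype, key):
        # A query is (class, type, key, ?, ?): the value is the unknown
        entry = self._store.get((rclass, rtype, key))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # TTL elapsed: forget the record, the caller must ask again
            del self._store[(rclass, rtype, key)]
            return None
        return value

cache = DnsCache()
cache.put("IN", "A", "www.example.com", "192.0.2.80", ttl=3600)
print(cache.get("IN", "A", "www.example.com"))
```

An answer served from such a cache is what tools report as non-authoritative: the caching server holds a copy, not the reference data.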

4.4 Implementation: BIND, named

URL: « »

«named» daemon
Configuration file: «/etc/named.conf» (usually).
Directory «/etc/namedb», which stores the zone files (usually).

Example 1: finding out the version of «named»:

    % dig ns.example.com version.bind chaos txt
    ; <<>> DiG 8.2 <<>> ns.example.com version.bind chaos txt
    ;; res options: init recurs defnam dnsrch
    ;; got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; QUERY SECTION:
    ;;      version.bind, type = TXT, class = CHAOS
    ;; ANSWER SECTION:
    version.bind.   0S CHAOS TXT    "bind 9"
    ;; Total query time: 3 msec
    ;; FROM: client.example.com to SERVER: default --
    ;; WHEN: Mon Sep 30 00:20:
    ;; MSG SIZE sent: 30 rcvd: 49

Example 2: finding out the version of «named»:

    % dig dmi.ens.fr version.bind chaos txt
    ; <<>> DiG <<>> dmi.ens.fr version.bind chaos txt
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 4
    ;; QUESTION SECTION:
    ;dmi.ens.fr.            IN A
    ;; ANSWER SECTION:
    dmi.ens.fr              IN A
    ;; AUTHORITY SECTION:
    ens.fr                  IN NS oseille.ens.fr.
    ens.fr                  IN NS dmi.ens.fr.
    ens.fr                  IN NS ext.lri.fr.
    ens.fr                  IN NS ns2.nic.fr.
    ens.fr                  IN NS clipper.ens.fr.
    ;; ADDITIONAL SECTION:
    ext.lri.fr              IN A
    ns2.nic.fr              IN A
    clipper.ens.fr          IN A
    oseille.ens.fr          IN A
    ;; Query time: 831 msec
    ;; SERVER: #53( )
    ;; WHEN: Mon Sep 30 00:19:
    ;; MSG SIZE rcvd: 210

    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; QUESTION SECTION:
    ;version.bind.          CH TXT
    ;; ANSWER SECTION:
    version.bind.   0       CH TXT "9.2.1"
    ;; Query time: 42 msec
    ;; SERVER: #53( )
    ;; WHEN: Mon Sep 30 00:19:
    ;; MSG SIZE rcvd: 48

4.5 F.root-servers.net (old version)

OBSOLETE, but kept to give an idea.

(see « »)

The Internet Software Consortium is proud to operate one of 13 root DNS servers as a public service to the Internet. The ISC has operated «F.root-servers.net» for the IANA ( since

F ( answers more than 272 million DNS queries per day, making it one of the busiest DNS servers in the world. In fact, it is often the busiest root nameserver on the Internet.

F is a virtual server made up of multiple (currently two) HP AlphaServers, donated to us by HP's Western Research Laboratory ( Each server is a HP ES40 AlphaServer with 4 500mhz CPUs and 8Gig of RAM, and runs ISC BIND as its DNS server.

98 Chapter 4 Domain Name Server (DNS) 4.5 F.root-servers.net (old version)

The servers are hosted at PAIX.net, Inc. in Palo Alto, California and are connected to the Internet via fdx Fast Ethernet connections which are provided by UUNET, Teleglobe and MFN.

For more information on the root DNS system, see: BCP 40 (RFC2870) - Operational guidelines for Root Name Servers.

Chapter 4 Domain Name Server (DNS) 4.6 F.root-servers.net (up to date)

2 global nodes, more than thirty local nodes spread across various countries.

99 Chapter 4 Domain Name Server (DNS) 4.7 The rndc utility

Syntax: rndc [options] cmd

rndc controls a running «named» remotely over TCP («rndc.conf» holds the access keys).

  «status»          status of named
  «dumpdb»          dumps the database and the cache to «/var/tmp/named_dump.db»
  «reload»          reloads the primary and secondary zones
  «stats»           dumps the statistics to «/var/tmp/named.stats»
  «trace/notrace»   raises or resets the trace level; traces go to «/var/tmp/named.run»
  «start»           starts named
  «stop»            stops named, saving pending updates first
  «halt»            stops named abruptly, without saving
  «restart»         stops and restarts named

Chapter 4 Domain Name Server (DNS) 4.8 The /etc/resolv.conf file

The resolver queries the nameservers listed in «/etc/resolv.conf».

Example «/etc/resolv.conf» file:

  domain formation.jussieu.fr
  search formation.jussieu.fr jussieu.fr
  nameserver
  nameserver

Caution:
  At most 3 «nameserver» lines are used.
  Not every UNIX understands the «search» directive.
  A comment starts with «;».
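The three rules above can be illustrated with a small parser. This is only a sketch, not the resolver's actual code: the function name, the sample addresses (taken from the 192.0.2.0/24 documentation range) and the choice to also skip «#» comments are assumptions made for the example.

```python
MAX_NAMESERVERS = 3  # limit stated in the course notes


def parse_resolv_conf(text):
    """Return (domain, search_list, nameservers) from resolv.conf-style text."""
    domain, search, nameservers = None, [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (";" per the notes; "#" is an assumption).
        if not line or line[0] in ";#":
            continue
        fields = line.split()
        keyword, args = fields[0], fields[1:]
        if keyword == "domain" and args:
            domain = args[0]
        elif keyword == "search":
            search = args
        elif keyword == "nameserver" and args:
            # Entries beyond the third are silently ignored.
            if len(nameservers) < MAX_NAMESERVERS:
                nameservers.append(args[0])
    return domain, search, nameservers


# Hypothetical file content; the addresses are placeholders.
SAMPLE = """\
; local resolver configuration
domain formation.jussieu.fr
search formation.jussieu.fr jussieu.fr
nameserver 192.0.2.1
nameserver 192.0.2.2
nameserver 192.0.2.3
nameserver 192.0.2.4
"""

domain, search, nameservers = parse_resolv_conf(SAMPLE)
print(domain, search, nameservers)
```

The fourth «nameserver» line in the sample is dropped, which mirrors the "at most 3" rule.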

100 Chapter 4 Domain Name Server (DNS) 4.9 Manual DNS queries: nslookup

Syntax: nslookup [options] name-to-resolve

% nslookup
Server: sunars1.formation.jussieu.fr
Address:

Non-authoritative answer:
Name:
Address:

The line «Non-authoritative answer» shows that the answer came from the DNS cache: the name had already been resolved in the recent past.

Chapter 4 Domain Name Server (DNS) 4.10 Manual DNS queries: dig

dig is the replacement for «nslookup». It works at a very low level.

Syntax: dig [options] name-to-resolve

Some of the flags displayed:
  «QR»: Query/Response (set in a response)
  «AA»: Authoritative Answer
  «TC»: TrunCation (the answer was truncated; it is not a "TCP" flag)
  «RD»: Recursion Desired
  «RA»: Recursion Available
  «AD»: Authentic Data (DNSSEC)
  «CD»: Checking Disabled (DNSSEC)

Header fields of a DNS message, in order:
  ID
  QR, Opcode, AA, TC, RD, RA, Z, AD, CD, RCODE
  QDCOUNT
  ANCOUNT
  NSCOUNT
  ARCOUNT
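To make the header layout above concrete, here is a minimal sketch that decodes the 12-byte header of a DNS message into the values dig prints on its «flags:» and «status:» lines. The bit positions follow the usual layout of the second 16-bit word (QR in the top bit, RCODE in the low four bits); the function name and the sample header bytes are invented for the example.

```python
import struct

# Bit position of each one-bit flag inside the 16-bit flags word.
FLAG_BITS = {"qr": 15, "aa": 10, "tc": 9, "rd": 8, "ra": 7, "ad": 5, "cd": 4}

# The most common response codes.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def parse_header(message):
    """Decode the first 12 bytes of a DNS message."""
    ident, word, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    rcode = word & 0xF
    return {
        "id": ident,
        "opcode": (word >> 11) & 0xF,
        "flags": [name for name, bit in FLAG_BITS.items() if word & (1 << bit)],
        "status": RCODES.get(rcode, str(rcode)),
        "counts": (qd, an, ns, ar),  # QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    }


# Hypothetical header: a recursive answer with 1 question, 2 answers,
# 3 authority records and 3 additional records.
example = struct.pack("!6H", 0x1234, 0x8180, 1, 2, 3, 3)
print(parse_header(example))
```

With the flags word 0x8180 the sketch reports «qr rd ra» and NOERROR, the same combination that appears in the dig outputs of this chapter.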

101 Chapter 4 Domain Name Server (DNS) 4.10 Manual DNS queries: dig

Example 1: following the delegations down from the root

% dig +trace

; <<>> DiG <<>> +trace
;; global options: printcmd
IN NS K.ROOT-SERVERS.NET
IN NS L.ROOT-SERVERS.NET
IN NS M.ROOT-SERVERS.NET
IN NS A.ROOT-SERVERS.NET
IN NS B.ROOT-SERVERS.NET
IN NS C.ROOT-SERVERS.NET
IN NS D.ROOT-SERVERS.NET
IN NS E.ROOT-SERVERS.NET
IN NS F.ROOT-SERVERS.NET
IN NS G.ROOT-SERVERS.NET
IN NS H.ROOT-SERVERS.NET
IN NS I.ROOT-SERVERS.NET
IN NS J.ROOT-SERVERS.NET.
;; Received 244 bytes from #53( ) in 5 ms

fr IN NS DNS.CS.WISC.EDU.
fr IN NS NS1.NIC.fr.
fr IN NS NS3.NIC.fr.
fr IN NS DNS.INRIA.fr.
fr IN NS NS2.NIC.fr.
fr IN NS DNS.PRINCETON.EDU.
fr IN NS NS-EXT.VIX.COM.
fr IN NS NS3.DOMAIN-REGISTRY.NL.
;; Received 373 bytes from #53(K.ROOT-SERVERS.NET) in 273 ms

jussieu.fr IN NS shiva.jussieu.fr.
jussieu.fr IN NS cendrillon.lptl.jussieu.fr.
jussieu.fr IN NS soleil.uvsq.fr.
;; Received 166 bytes from #53(DNS.CS.WISC.EDU) in 337 ms

IN CNAME serveur.formation.jussieu.fr.
serveur.formation.jussieu.fr IN A
formation.jussieu.fr IN NS cendrillon.lptl.jussieu.fr.
formation.jussieu.fr IN NS shiva.jussieu.fr.
formation.jussieu.fr IN NS soleil.uvsq.fr.
;; Received 204 bytes from #53(shiva.jussieu.fr) in 217 ms

The trace shows how the nameservers are consulted one after another: a root server, then a server for «fr», then a server for «jussieu.fr».

102 Chapter 4 Domain Name Server (DNS) 4.10 Manual DNS queries: dig

Example 2: a plain lookup, nslookup style

% dig

; <<>> DiG <<>>
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 3, ADDITIONAL: 3

;; QUESTION SECTION:
; IN A

;; ANSWER SECTION:
IN CNAME serveur.formation.jussieu.fr.
serveur.formation.jussieu.fr IN A

;; AUTHORITY SECTION:
formation.jussieu.fr IN NS cendrillon.lptl.jussieu.fr.
formation.jussieu.fr IN NS shiva.jussieu.fr.
formation.jussieu.fr IN NS soleil.uvsq.fr.

;; ADDITIONAL SECTION:
shiva.jussieu.fr IN A
soleil.uvsq.fr IN A
cendrillon.lptl.jussieu.fr IN A

;; Query time: 5 msec
;; SERVER: #53( )
;; WHEN: Thu Aug 29 00:22:
;; MSG SIZE rcvd: 204

103 Chapter 4 Domain Name Server (DNS) 4.10 Manual DNS queries: dig

Example 3: the answer when the name does not exist

% dig cerise

; <<>> DiG <<>> cerise
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id:
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;cerise. IN A

;; AUTHORITY SECTION:
IN SOA A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM

;; Query time: 206 msec
;; SERVER: #53( )
;; WHEN: Thu Aug 29 00:22:
;; MSG SIZE rcvd: 98

Example 4: specifying the record type

First query:

% dig SOA

; <<>> DiG 8.2 <<>> SOA
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUERY SECTION:
;; type = SOA, class = IN

;; AUTHORITY SECTION:
crlv.org. 2H IN SOA ns.easynet.fr. hostmaster.easynet.fr. (
        ; serial
    1H  ; refresh
    30M ; retry
    4W  ; expiry
    2H ) ; minimum

;; Total query time: 14 msec
;; FROM: apollinaire.paris4.sorbonne.fr to SERVER: default
;; WHEN: Thu Aug 29 15:06:
;; MSG SIZE sent: 30 rcvd: 90

104 Chapter 4 Domain Name Server (DNS) 4.10 Manual DNS queries: dig

Second query:

% dig SOA

; <<>> DiG 8.2 <<>> SOA
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUERY SECTION:
;; type = SOA, class = IN

;; AUTHORITY SECTION:
crlv.org. 1h55m43s IN SOA ns.easynet.fr. hostmaster.easynet.fr. (
        ; serial
    1H  ; refresh
    30M ; retry
    4W  ; expiry
    2H ) ; minimum

;; Total query time: 13 msec
;; FROM: apollinaire.paris4.sorbonne.fr to SERVER: default
;; WHEN: Thu Aug 29 15:10:
;; MSG SIZE sent: 30 rcvd: 98

The second answer comes from the cache: the «aa» flag is no longer set and the TTL has dropped from 2H to 1h55m43s.

Chapter 4 Domain Name Server (DNS) 4.11 PTR records

A nameserver can also be queried to resolve addresses back to names:

% /usr/sbin/nslookup
Server: sunars1.formation.jussieu.fr
Address:

Name:
Address:

Compare with:

% /usr/sbin/nslookup -query=ptr in-addr.arpa
Server: sunars1.formation.jussieu.fr
Address:

Non-authoritative answer:
in-addr.arpa name =

Authoritative answers can be found from:
in-addr.arpa nameserver = ns1.fth.net
in-addr.arpa nameserver = ns2.fth.net
ns1.fth.net internet address =
ns2.fth.net internet address =
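The name queried in a PTR lookup is built by reversing the four octets of the address and appending «in-addr.arpa». A short sketch of that construction follows; the helper name and the sample address (from the 192.0.2.0/24 documentation range) are illustrative only. The standard «ipaddress» module provides the same result through its «reverse_pointer» attribute, which is used here as a cross-check.

```python
import ipaddress


def reverse_name(address):
    """Build the in-addr.arpa name used to look up the PTR record of an IPv4 address."""
    # IPv4Address validates the input and normalises it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


# Placeholder address, not one taken from the course.
name = reverse_name("192.0.2.10")
print(name)

# Cross-check against the standard library.
assert name == ipaddress.ip_address("192.0.2.10").reverse_pointer
```

Passing the resulting name to «nslookup -query=ptr» or «dig ... PTR» is equivalent to handing the bare address to nslookup.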

105 Chapter 4 Domain Name Server (DNS) 4.11 PTR records

[Figure: the DNS tree with its two branches. The forward branch goes from the root «.» through «fr», «jussieu» and «formation» down to «sunars1»; the reverse branch goes from the root through «arpa» and «in-addr» down to the address labels (134, ...) and points back to sunars1.formation.jussieu.fr. The two branches must be kept consistent.]

A common mistake: forgetting to update the entry for the machine's IP address (the reverse branch) when the forward entry changes.

Chapter 4 Domain Name Server (DNS) 4.12 The /etc/nsswitch.conf file

Some UNIX systems let the administrator choose which resolution methods to use (DNS, «/etc/hosts», NIS) and in which order to chain them.

On Linux and Solaris, see «/etc/nsswitch.conf» (see page 408):

  ...
  hosts: files nisplus nis dns
  ...

or

  ...
  hosts: xfn nisplus dns [NOTFOUND=return] files
  ...

106 Chapter 4 Domain Name Server (DNS) 4.13 Delegating part of a class C network

RFC 2317

Opinion: a clever mechanism, but somewhat complicated to put into practice.

Chapter 4 Domain Name Server (DNS) 4.13 Delegating part of a class C network: WebDNS

The WebDNS software.

Principle: generate the DNS data from a real database, with all the fine-grained controls that come with it (for example, a person can be granted the SQL right to modify one and only one DNS record in the SQL database).

The software is not restricted to class C subnets. It is in use on the Jussieu campus, for example.

Opinion: a very fashionable approach to a fundamental problem in the design of the DNS, one that shows up in real deployments.

Opinion 2: yet another example of coupling a service to a database.
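To give an idea of what the RFC 2317 mechanism involves, the sketch below generates the records that the parent reverse zone would carry when it delegates a sub-range of a /24: NS records for a child zone named after the sub-range, plus one CNAME per host address pointing into that child zone. The «first-address/prefix-length» label follows the naming used in the RFC's examples; the function name, the sample subnet and the nameserver names are assumptions made for the illustration.

```python
import ipaddress


def rfc2317_parent_records(subnet, nameservers):
    """Return zone-file lines for the parent /24 reverse zone (RFC 2317 style)."""
    net = ipaddress.ip_network(subnet)
    if net.version != 4 or not 25 <= net.prefixlen <= 30:
        raise ValueError("expected an IPv4 subnet between /25 and /30")
    octets = str(net.network_address).split(".")
    # Reverse zone of the enclosing /24, e.g. 2.0.192.in-addr.arpa.
    parent = ".".join(reversed(octets[:3])) + ".in-addr.arpa."
    # Label of the delegated child zone, e.g. 128/26.
    child = f"{octets[3]}/{net.prefixlen}"
    lines = [f"$ORIGIN {parent}"]
    lines += [f"{child} NS {ns}" for ns in nameservers]
    # One CNAME per host address, redirecting into the child zone.
    for host in net.hosts():
        last = str(host).split(".")[3]
        lines.append(f"{last} CNAME {last}.{child}.{parent}")
    return lines


# Hypothetical delegation of 192.0.2.128/26 to two made-up nameservers.
records = rfc2317_parent_records(
    "192.0.2.128/26", ["ns1.example.org.", "ns2.example.org."]
)
print("\n".join(records[:5]))
```

The child zone then holds the actual PTR records (for instance «129.128/26.2.0.192.in-addr.arpa.»), which is the part that makes the scheme somewhat laborious in practice.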

107 Chapter 4 Domain Name Server (DNS) 4.13 Delegating part of a class C network

[Figure: WebDNS architecture. A web server (PostgreSQL client) reachable from the Internet over HTTP/HTTPS; a data server (PostgreSQL) holding the database; a DNS server (PostgreSQL client) that produces «named.boot» and the zone files.]

Chapter 4 Domain Name Server (DNS) 4.14 Machine names: hostname

Allowed characters: see RFC 952 and RFC 1123.

In short:
  uppercase letters: allowed
  lowercase letters: allowed
  digits: allowed
  the minus sign «-»: allowed
  the underscore «_»: not allowed by these RFCs
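The character rules can be turned into a small check. This sketch follows RFC 952 as relaxed by RFC 1123: each label consists of letters, digits and «-», it neither starts nor ends with «-», and the underscore is rejected. The length limits used (63 characters per label, 255 for the whole name) and the function name are choices made for the example.

```python
import re

# One label: letters, digits and "-", not starting or ending with "-",
# at most 63 characters.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name):
    """Check a host name against the RFC 952 / RFC 1123 character rules."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a fully qualified name with a trailing dot
    if not name or len(name) > 255:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))


for candidate in ("sunars1.formation.jussieu.fr", "my_host.example", "-bad.example"):
    print(candidate, is_valid_hostname(candidate))
```

Only the first of the three candidates passes; the second fails because of the underscore and the third because of the leading «-».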

108 Chapter 4 Domain Name Server (DNS) 4.15 WHOIS

WHOIS is the database of information about the allocation of IP address ranges and domain names.

It is an example of a botched Internet protocol: the implementations are not compatible with one another.

RFC 954, TCP port 43.

The protocol is used through the «whois» command.

Syntax: «whois [ -h whois-server ] address-or-domain»

Some WHOIS servers:
  «rs.internic.net»
  «whois.nic.fr» (also available through a WWW interface)
  «whois.ripe.net»

109 Chapter 4 Domain Name Server (DNS) 4.16 (Windows: ipconfig /displaydns, ipconfig /flushdns)

A Windows machine keeps an internal cache of DNS lookups.

The command «ipconfig /displaydns» displays the cache.

The command «ipconfig /flushdns» flushes it.

Chapter 4 Domain Name Server (DNS) 4.17 Trust space

110 Chapter 4 Domain Name Server (DNS) 4.18 Some documentation

  The ARS networking course
  ftp://ftp.isc.org/isc/bind/
  ftp://ftp.univ-rennes1.fr/pub/reseau/dns/exemple/
  RFC 2317 «Classless IN-ADDR.ARPA delegation»
  ftp://ftp.jussieu.fr/jussieu/doc/local/dnsmail.ps.z
  DNS and BIND, 4th Edition, by Paul Albitz & Cricket Liu, April 2001, O'Reilly & Associates, Inc., 622 pages, $44.95

Chapter 4 Domain Name Server (DNS) 4.19 Appendix 1

The printed version of this course includes the list of DNS top level domains at this point.

111 Generic top level domains http :// Generic top level domains (gtld) «.aero» Aviation «.asia» Asia «.biz» Business Organizations «.cat» Catalan language and culture «.com» Commercial «.coop» Co-Operative Organizations «.edu» Education «.gov» US Government «.info» Open TLD «.int» International Organizations «.jobs» Jobs «.mil» US Department of Defense «.mobi» Mobile devices «.museum» Museums «.name» Personal «.net» Networks «.org» Organizations «.pro» Credentialed professionals and related entities «.tel» Publishing of contact data «.travel» Travelling Country code top level domains (cctld) A «.ac» Ascension Island «.ad» Andorra «.ae» United Arab Emirates «.af» Afghanistan «.ag» Antigua and Barbuda «.ai» Anguilla «.al» Albania «.am» Armenia «.an» Netherlands Antilles «.ao» Angola «.aq» Antarctica «.ar» Argentina «.as» American Samoa «.at» Austria «.au» Australia «.aw» Aruba «.ax» Åland Islands «.az» Azerbaijan B «.ba» Bosnia and Herzegovina «.bb» Barbados «.bd» Bangladesh «.be» Belgium «.bf» Burkina Faso «.bg» Bulgaria «.bh» Bahrain «.bi» Burundi «.bj» Benin «.bm» Bermuda «.bn» Brunei Darussalam «.bo» Bolivia «.br» Brazil «.bs» Bahamas «.bt» Bhutan «.bv» Bouvet Island «.bw» Botswana «.by» Belarus «.bz» Belize C «.ca» Canada «.cc» Cocos (Keeling) Islands «.cd» Congo, Democratic republic of the (former Zaire) «.cf» Central African Republic «.cg» Congo, Republic of «.ch» Switzerland «.ci» Côte d Ivoire 1

112 «.ck» Cook Islands «.cl» Chile «.cm» Cameroon «.cn» China «.co» Colombia «.cr» Costa Rica «.cs» Czechoslovakia (former? non-existing) «.cu» Cuba «.cv» Cape Verde «.cx» Christmas Island «.cy» Cyprus «.cz» Czech Republic D «.de» Germany «.dj» Djibouti «.dk» Denmark «.dm» Dominica «.do» Dominican Republic «.dz» Algeria E «.ec» Ecuador «.ee» Estonia «.eg» Egypt «.eh» Western Sahara «.er» Eritrea «.es» Spain «.et» Ethiopia «.eu» European Union F «.fi» Finland «.fj» Fiji «.fk» Falkland Islands «.fm» Micronesia «.fo» Faroe Islands «.fr» France G «.ga» Gabon «.gb» United Kingdom «.gd» Grenada «.ge» Georgia «.gf» French Guiana «.gg» Guernsey «.gh» Ghana «.gi» Gibraltar «.gl» Greenland «.gm» Gambia «.gn» Guinea «.gp» Guadeloupe «.gq» Equatorial Guinea «.gr» Greece «.gs» South Georgia and the South Sandwich Islands «.gt» Guatemala «.gu» Guam «.gw» Guinea-Bissau «.gy» Guyana H «.hk» Hong Kong «.hm» Heard and McDonald Islands «.hn» Honduras «.hr» Croatia «.ht» Haiti «.hu» Hungary I «.id» Indonesia «.ie» Ireland «.il» Israel «.im» Isle of Man «.in» India «.io» British Indian Ocean Territory «.iq» Iraq «.ir» Iran «.is» Iceland «.it» Italia J «.je» Jersey «.jm» Jamaica «.jo» Jordan «.jp» Japan K «.ke» Kenya «.kg» Kyrgyzstan «.kh» Cambodia «.ki» Kiribati «.km» Comoros «.kn» Saint Kitts and Nevis «.kp» Korea, Democratic Peoples Republic of «.kr» Korea, Republic of «.kw» Kuwait «.ky» Cayman Islands «.kz» Kazakhstan L «.la» Lao People s Democratic Republic «.lb» Lebanon «.lc» Saint Lucia «.li» Liechtenstein «.lk» Sri Lanka «.lr» Liberia «.ls» Lesotho 2

113 «.lt» Lithuania «.lu» Luxembourg «.lv» Latvia «.ly» Libyan Arab Jamahiriya M «.ma» Morocco «.mc» Monaco «.md» Moldova «.me» Montenegro «.mg» Madagascar «.mh» Marshall Islands «.mk» Macedonia «.ml» Mali «.mm» Myanmar «.mn» Mongolia «.mo» Macau «.mp» Northern Mariana Islands «.mq» Martinique «.mr» Mauritania «.ms» Montserrat «.mt» Malta «.mu» Mauritius «.mv» Maldives «.mw» Malawi «.mx» Mexico «.my» Malaysia «.mz» Mozambique N «.na» Namibia «.nc» New Caledonia «.ne» Niger «.nf» Norfolk Island «.ng» Nigeria «.ni» Nicaragua «.nl» The Netherlands «.no» Norway «.np» Nepal «.nr» Nauru «.nu» Niue «.nz» New Zealand O «.om» Oman P «.pa» Panama «.pe» Peru «.pf» French Polynesia «.pg» Papua New Guinea «.ph» Philippines «.pk» Pakistan «.pl» Poland «.pm» St. Pierre and Miquelon «.pn» Pitcairn «.pr» Puerto Rico «.ps» Palestine «.pt» Portugal «.pw» Palau «.py» Paraguay Q «.qa» Qatar R «.re» Reunion «.ro» Romania «.rs» Serbia «.ru» Russia «.rw» Rwanda S «.sa» Saudi Arabia «.sb» Solomon Islands «.sc» Seychelles «.sd» Sudan «.se» Sweden «.sg» Singapore «.sh» St. Helena «.si» Slovenia «.sj» Svalbard and Jan Mayen Islands «.sk» Slovakia «.sl» Sierra Leone «.sm» San Marino «.sn» Senegal «.so» Somalia «.sr» Surinam «.st» Sao Tome and Principe «.su» USSR (former) «.sv» El Salvador «.sy» Syrian Arab Republic «.sz» Swaziland T «.tc» The Turks and Caicos Islands «.td» Chad «.tf» French Southern Territories «.tg» Togo «.th» Thailand «.tj» Tajikistan «.tk» Tokelau «.tl» Timor-Leste «.tm» Turkmenistan «.tn» Tunisia «.to» Tonga «.tp» East Timor «.tr» Turkey 3

114 «.tt» Trinidad and Tobago «.tv» Tuvalu «.tw» Taiwan «.tz» Tanzania U «.ua» Ukraine «.ug» Uganda «.uk» United Kingdom «.um» United States Minor Outlying Islands «.us» United States «.uy» Uruguay «.uz» Uzbekistan V «.va» Holy See (Vatican City State) «.vc» Saint Vincent and the Grenadines «.ve» Venezuela «.vg» Virgin Islands British «.vi» Virgin Islands U.S «.vn» Vietnam «.vu» Vanuatu W «.wf» Wallis and Futuna Islands «.ws» Samoa Y «.ye» Yemen «.yt» Mayotte «.yu» Yugoslavia Z «.za» South Africa «.zm» Zambia «.zr» Zaire (non-existent, see Congo) «.zw» Zimbabwe 4

115 Chapter 4 Domain Name Server (DNS) 4.20 Appendix 2

The printed version of this course includes RFC 2870 at this point; it sets out the requirements for a root name server.

116
Network Working Group
Request for Comments: 2870
Obsoletes: 2010
BCP: 40
Category: Best Current Practice

R. Bush, Verio
D. Karrenberg, RIPE NCC
M. Kosters, Network Solutions
R. Plzak, SAIC
June 2000

Root Name Server Operational Requirements

Status of this Memo

This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

As the internet becomes increasingly critical to the world's social and economic infrastructure, attention has rightly focused on the correct, safe, reliable, and secure operation of the internet infrastructure itself. The root domain name servers are seen as a crucial part of that technical infrastructure. The primary focus of this document is to provide guidelines for operation of the root name servers. Other major zone server operators (gtlds, cctlds, major zones) may also find it useful. These guidelines are intended to meet the perceived societal needs without overly prescribing technical details.

1. Background

The resolution of domain names on the internet is critically dependent on the proper, safe, and secure operation of the root domain name servers. Currently, these dozen or so servers are provided and operated by a very competent and trusted group of volunteers. This document does not propose to change that, but merely to provide formal guidelines so that the community understands how and why this is done.

117 RFC 2870 Root Name Server Operational Requirements June 2000

1.1 The Internet Corporation for Assigned Names and Numbers (ICANN) has become responsible for the operation of the root servers. The ICANN has appointed a Root Server System Advisory Committee (RSSAC) to give technical and operational advice to the ICANN board. The ICANN and the RSSAC look to the IETF to provide engineering standards.

1.2 The root servers serve the root, aka ".", zone. Although today some of the root servers also serve some TLDs (top level domains) such as gtlds (COM, NET, ORG, etc.), infrastructural TLDs such as INT and IN-ADDR.ARPA, and some cctlds (country code TLDs, e.g. SE for Sweden), this is likely to change (see 2.5).

1.3 The root servers are neither involved with nor dependent upon the whois data.

1.4 The domain name system has proven to be sufficiently robust that we are confident that the, presumably temporary, loss of most of the root servers should not significantly affect operation of the internet.

1.5 Experience has shown that the internet is quite vulnerable to incorrect data in the root zone or TLDs. Hence authentication, validation, and security of these data are of great concern.

2. The Servers Themselves

The following are requirements for the technical details of the root servers themselves:

2.1 It would be short-sighted of this document to specify particular hardware, operating systems, or name serving software. Variations in these areas would actually add overall robustness.

2.2 Each server MUST run software which correctly implements the IETF standards for the DNS, currently [RFC1035] [RFC2181]. While there are no formal test suites for standards compliance, the maintainers of software used on root servers are expected to take all reasonable actions to conform to the IETF's then current documented expectations.

2.3 At any time, each server MUST be able to handle a load of requests for root data which is three times the measured peak of such requests on the most loaded server in then current normal conditions. This is usually expressed in requests per second. This is intended to ensure continued operation of root services should two thirds of the servers be taken out of operation, whether by intent, accident, or malice.

118

2.4 Each root server should have sufficient connectivity to the internet to support the bandwidth needs of the above requirement. Connectivity to the internet SHOULD be as diverse as possible. Root servers SHOULD have mechanisms in place to accept IP connectivity to the root server from any internet provider delivering connectivity at their own cost.

2.5 Servers MUST provide authoritative responses only from the zones they serve. The servers MUST disable recursive lookup, forwarding, or any other function that may allow them to provide cached answers. They also MUST NOT provide secondary service for any zones other than the root and root-servers.net zones. These restrictions help prevent undue load on the root servers and reduce the chance of their caching incorrect data.

2.6 Root servers MUST answer queries from any internet host, i.e. may not block root name resolution from any valid IP address, except in the case of queries causing operational problems, in which case the blocking SHOULD last only as long as the problem, and be as specific as reasonably possible.

2.7 Root servers SHOULD NOT answer AXFR, or other zone transfer, queries from clients other than other root servers. This restriction is intended to, among other things, prevent unnecessary load on the root servers as advice has been heard such as "To avoid having a corruptible cache, make your server a stealth secondary for the root zone." The root servers MAY put the root zone up for ftp or other access on one or more less critical servers.

2.8 Servers MUST generate checksums when sending UDP datagrams and MUST verify checksums when receiving UDP datagrams containing a non-zero checksum.

3. Security Considerations

The servers need both physical and protocol security as well as unambiguous authentication of their responses.

3.1 Physical security MUST be ensured in a manner expected of data centers critical to a major enterprise.

Whether or not the overall site in which a root server is located has access control, the specific area in which the root server is located MUST have positive access control, i.e. the number of individuals permitted access to the area MUST be limited, controlled, and recorded. At a minimum, control measures SHOULD be either mechanical or electronic locks. Physical security MAY be enhanced by the use of intrusion detection and motion sensors, multiple serial access points, security personnel, etc.

119

Unless there is documentable experience that the local power grid is more reliable than the MTBF of a UPS (i.e. five to ten years), power continuity for at least 48 hours MUST be assured, whether through on-site batteries, on-site power generation, or some combination thereof. This MUST supply the server itself, as well as the infrastructure necessary to connect the server to the internet. There MUST be procedures which ensure that power fallback mechanisms and supplies are tested no less frequently than the specifications and recommendations of the manufacturer.

Fire detection and/or retardation MUST be provided.

Provision MUST be made for rapid return to operation after a system outage. This SHOULD involve backup of systems software and configuration. But SHOULD also involve backup hardware which is pre-configured and ready to take over operation, which MAY require manual procedures.

3.2 Network security should be of the level provided for critical infrastructure of a major commercial enterprise.

The root servers themselves MUST NOT provide services other than root name service e.g. remote internet protocols such as http, telnet, rlogin, ftp, etc. The only login accounts permitted should be for the server administrator(s). "Root" or "privileged user" access MUST NOT be permitted except through an intermediate user account.

Servers MUST have a secure mechanism for remote administrative access and maintenance. Failures happen; given the 24x7 support requirement (per 4.5), there will be times when something breaks badly enough that senior wizards will have to connect remotely. Remote logins MUST be protected by a secure means that is strongly authenticated and encrypted, and sites from which remote login is allowed MUST be protected and hardened.

Root name servers SHOULD NOT trust other hosts, except secondary servers trusting the primary server, for matters of authentication, encryption keys, or other access or security information. If a root operator uses kerberos authentication to manage access to the root server, then the associated kerberos key server MUST be protected with the same prudence as the root server itself. This applies to all related services which are trusted in any manner.

120

The LAN segment(s) on which a root server is homed MUST NOT also home crackable hosts. I.e. the LAN segments should be switched or routed so there is no possibility of masquerading. Some LAN switches aren't suitable for security purposes, there have been published attacks on their filtering. While these can often be prevented by careful configuration, extreme prudence is recommended. It is best if the LAN segment simply does not have any other hosts on it.

The LAN segment(s) on which a root server is homed SHOULD be separately firewalled or packet filtered to discourage network access to any port other than those needed for name service.

The root servers SHOULD have their clocks synchronized via NTP [RFC1305] [RFC2030] or similar mechanisms, in as secure manner as possible. For this purpose, servers and their associated firewalls SHOULD allow the root servers to be NTP clients. Root servers MUST NOT act as NTP peers or servers.

All attempts at intrusion or other compromise SHOULD be logged, and all such logs from all root servers SHOULD be analyzed by a cooperative security team communicating with all server operators to look for patterns, serious attempts, etc. Servers SHOULD log in GMT to facilitate log comparison.

Server logging SHOULD be to separate hosts which SHOULD be protected similarly to the root servers themselves.

The server SHOULD be protected from attacks based on source routing. The server MUST NOT rely on address or name based authentication.

The network on which the server is homed SHOULD have in-addr.arpa service.

3.3 Protocol authentication and security are required to ensure that data presented by the root servers are those created by those authorized to maintain the root zone data.

121

The root zone MUST be signed by the Internet Assigned Numbers Authority (IANA) in accordance with DNSSEC, see [RFC2535] or its replacements. It is understood that DNSSEC is not yet deployable on some common platforms, but will be deployed when supported.

Root servers MUST be DNSSEC capable so that queries may be authenticated by clients with security and authentication concerns. It is understood that DNSSEC is not yet deployable on some common platforms, but will be deployed when supported.

Transfer of the root zone between root servers MUST be authenticated and be as secure as reasonably possible. Out of band security validation of updates MUST be supported. Servers MUST use DNSSEC to authenticate root zones received from other servers. It is understood that DNSSEC is not yet deployable on some common platforms, but will be deployed when supported.

A hidden primary server, which only allows access by the authorized secondary root servers, MAY be used.

Root zone updates SHOULD only progress after a number of heuristic checks designed to detect erroneous updates have been passed. In case the update fails the tests, human intervention MUST be requested.

Root zone updates SHOULD normally be effective no later than 6 hours from notification of the root server operator.

A special procedure for emergency updates SHOULD be defined. Updates initiated by the emergency procedure SHOULD be made no later than 12 hours after notification.

In the advent of a critical network failure, each root server MUST have a method to update the root zone data via a medium which is delivered through an alternative, non-network, path.

Each root MUST keep global statistics on the amount and types of queries received/answered on a daily basis. These statistics must be made available to RSSAC and RSSAC sponsored researchers to help determine how to better deploy these machines more efficiently across the internet. Each root MAY collect data snapshots to help determine data points such as DNS query storms, significant implementation bugs, etc.

4. Communications

Communications and coordination between root server operators and between the operators and the IANA and ICANN are necessary.

4.1 Planned outages and other down times SHOULD be coordinated between root server operators to ensure that a significant number of the root servers are not all down at the same time. Preannouncement of planned outages also keeps other operators from wasting time wondering about any anomalies.

4.2 Root server operators SHOULD coordinate backup timing so that many servers are not off-line being backed up at the same time. Backups SHOULD be frequently transferred off site.

4.3 Root server operators SHOULD exchange log files, particularly as they relate to security, loading, and other significant events. This MAY be through a central log coordination point, or MAY be informal.

4.4 Statistics as they concern usage rates, loading, and resource utilization SHOULD be exchanged between operators, and MUST be reported to the IANA for planning and reporting purposes.

4.5 Root name server administrative personnel MUST be available to provide service 24 hours a day, 7 days per week. On call personnel MAY be used to provide this service outside of normal working hours.

5. Acknowledgements

The authors would like to thank Scott Bradner, Robert Elz, Chris Fletcher, John Klensin, Steve Bellovin, and Vern Paxson for their constructive comments.

123 References

[RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November

[RFC1305] Mills, D., "Network Time Protocol (Version 3) Specification, Implementation", RFC 1305, March

[RFC2030] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 for IPv4, IPv6 and OSI", RFC 2030, October

[RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS Specification", RFC 2181, July

[RFC2535] Eastlake, D. and C. Kaufman, "Domain Name System Security Extensions", RFC 2535, March

124 Authors' Addresses

Randy Bush
Verio, Inc
Crystal Springs
Bainbridge Island, WA
US
Phone:

Daniel Karrenberg
RIPE Network Coordination Centre (NCC)
Singel 258
NL 1016 AB Amsterdam
Netherlands
Phone:

Mark Kosters
Network Solutions
505 Huntmar Park Drive
Herndon, VA
Phone:

Raymond Plzak
SAIC
1710 Goodridge Drive
McLean, Virginia

Specification of Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC

125 Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.

126 Chapter 4 Domain Name Server (DNS) 4.21 Appendix 3 Attached to the printed version of this course: an article on the root name servers.

127 How many DNS root servers are there really? Stéphane Bortzmeyer. First written on 27 November. Last updated on 13 January.

The question regularly resurfaces in discussions on Internet governance: how many root servers does the DNS have? As often with figures, it all depends on what is being measured. The puppet of the US government even contradicts itself in its own propaganda. In 2007, ICANN loudly proclaimed that there were not just thirteen root servers, to mock the criticism of its competitor, the ITU. In 2009, in the official publications of the same organization, they are back to thirteen. In fact, both texts are right, because they do not measure the same thing. What is funny is that the 2007 text is very aggressive and very arrogant, implying that people who talk about the thirteen servers are ignorant, while the same organization takes up that count again two years later.

So, to get the authentic figures, one has to spend a little time and better understand what lies behind them. All the details can be found on the (unofficial) site (root-servers.org/) of some of the root server operators. The result:

There are eleven organizations that run a root server (VeriSign, ISI, Cogent, University of Maryland, NASA, ISC, US Army, Autonomica, RIPE-NCC, ICANN and WIDE). Only two are European and one is Japanese; all the others are from the United States. Changing this list is nearly impossible for political reasons. There is no process for recruiting a root server operator and none for dismissing one, however strange the things it does. In the absence of an authority (in the moral and political sense of the term) that could decide to replace one organization, which does not provide a great service, with another (there is no shortage of good ones), the list has not seen a single change in fifteen years, a unique case of stability on the Internet. The status quo seems to be the only solution.

128 There are thirteen names in the root (thirteen NS, for "Name Server", records), from A.root-servers.net to M.root-servers.net. With dig, the command dig NS . will display them. Deducing from this number that the root is not resilient enough to failures ("There are only thirteen poor machines", and here one pictures an old Dell server in its rack) would therefore be wrong: these thirteen servers are not thirteen machines. Note that the number thirteen comes from old considerations about the former size limit of DNS packets, once 512 bytes. The limit was extended ten years ago and was actually exceeded when IPv6 addresses were introduced in the root in 2008 (announcement-04feb08.htm), but nobody dares take responsibility for questioning this magic number of thirteen.

There are twenty-one IP addresses of root name servers (some do not yet have an IPv6 address).

Thanks to anycast, there are one hundred and eighty-nine different physical sites hosting a root server, such as the one in Prague, announced in the ICANN press release (icann.org/en/announcements/announcement-26oct09-en.htm) of October. A good number of these sites are purely local: their BGP announcements are not propagated outside a limited circle, so these sites are not reachable from outside (for example, they are often limited to the operators connected to the same exchange point). This number changes often and depends solely on the decisions of each organization running a server. Some, the most dynamic such as ISC, open sites at a sustained pace.

There is an unknown number of machines providing this service, certainly well over two hundred, since most sites host more than one machine behind a load balancer.

To analyze the root's resilience to failures, the figure to consider depends on the failure envisaged. If it is the failure of an electronic component in a computer, the number of physical machines is the important parameter. If the failure is a fire or an earthquake, it is rather the number of sites that counts, because the failure will affect all the servers located at the site. Finally, if the failure is organizational in nature (the case of some root servers whose hosting organization makes little effort and deploys few resources, or even suffers internal conflicts), it is the figure of eleven, the number of organizations, that must be taken into account to assess the reliability of the root.
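As a rough check on the 512-byte argument above, the following sketch estimates the size of a response that lists the root servers together with their addresses. It assumes the classic DNS wire format with name compression and the names a.root-servers.net to m.root-servers.net; the function names are illustrative, and the byte counts are a back-of-the-envelope approximation, not a capture of real traffic.

```python
# Approximate size, in bytes, of a DNS response listing the root servers.
# Assumption: classic wire format with name compression, no EDNS0 options.

HEADER = 12                # fixed DNS header
QUESTION = 1 + 2 + 2       # root name ".", QTYPE, QCLASS
RR_FIXED = 2 + 2 + 4 + 2   # TYPE, CLASS, TTL, RDLENGTH of every record


def encoded_name_length(name: str) -> int:
    """Uncompressed wire length: one length byte per label, plus the root byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_response_size(n_servers: int, n_ipv6: int = 0) -> int:
    """Estimate the response size for n_servers names, n_ipv6 of which also have an AAAA."""
    first_target = encoded_name_length("a.root-servers.net.")
    # First NS record: owner "." (1 byte), fixed fields, full target name.
    size = HEADER + QUESTION + 1 + RR_FIXED + first_target
    # Other NS records: the target is a one-letter label plus a 2-byte pointer.
    size += (n_servers - 1) * (1 + RR_FIXED + (1 + 1 + 2))
    # One A record per server: 2-byte pointer as owner, fixed fields, 4-byte address.
    size += n_servers * (2 + RR_FIXED + 4)
    # AAAA records: same layout with a 16-byte address.
    size += n_ipv6 * (2 + RR_FIXED + 16)
    return size


if __name__ == "__main__":
    print(priming_response_size(13))      # IPv4 only
    print(priming_response_size(13, 6))   # with six IPv6 addresses (arbitrary count)
```

Under these assumptions, thirteen names with one IPv4 address each fit within 512 bytes with some room to spare, while adding a handful of IPv6 addresses (28 bytes per record) is enough to exceed the old limit.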

129 Chapter 4 Domain Name Server (DNS) 4.22 Appendix 4 Attached to the printed version of this course: RFC 2606, on the domain names that can be used for tests, for mock-ups, and for machines not reachable from the Internet.

130 Network Working Group
Request for Comments: 2606
BCP: 32
Category: Best Current Practice
D. Eastlake, A. Panitz
June 1999

Reserved Top Level DNS Names

Status of this Memo

This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1999). All Rights Reserved.

Abstract

To reduce the likelihood of conflict and confusion, a few top level domain names are reserved for use in private testing, as examples in documentation, and the like. In addition, a few second level domain names reserved for use as examples are documented.

Table of Contents

1. Introduction
2. TLDs for Testing, & Documentation Examples
3. Reserved Example Second Level Domain Names
4. IANA Considerations
5. Security Considerations
References
Authors' Addresses
Full Copyright Statement

1. Introduction

The global Internet Domain Name System is documented in [RFC 1034, 1035, 1591] and numerous additional Requests for Comment. It defines a tree of names starting with root, ".", immediately below which are top level domain names such as ".com" and ".us". Below top level domain names there are normally additional levels of names.

131 2. TLDs for Testing, & Documentation Examples

There is a need for top level domain (TLD) names that can be used for creating names which, without fear of conflicts with current or future actual TLD names in the global DNS, can be used for private testing of existing DNS related code, examples in documentation, DNS related experimentation, invalid DNS names, or other similar uses.

For example, without guidance, a site might set up some local additional unused top level domains for testing of its local DNS code and configuration. Later, these TLDs might come into actual use on the global Internet. As a result, local attempts to reference the real data in these zones could be thwarted by the local test versions. Or test or example code might be written that accesses a TLD that is in use with the thought that the test code would only be run in a restricted testbed net or the example never actually run. Later, the test code could escape from the testbed or the example be actually coded and run on the Internet. Depending on the nature of the test or example, it might be best for it to be referencing a TLD permanently reserved for such purposes.

To safely satisfy these needs, four domain names are reserved as listed and described below.

.test
.example
.invalid
.localhost

".test" is recommended for use in testing of current or new DNS related code.

".example" is recommended for use in documentation or as examples.

".invalid" is intended for use in online construction of domain names that are sure to be invalid and which it is obvious at a glance are invalid.

The ".localhost" TLD has traditionally been statically defined in host DNS implementations as having an A record pointing to the loop back IP address and is reserved for such use. Any other use would conflict with widely deployed code which assumes this use.

3. Reserved Example Second Level Domain Names

The Internet Assigned Numbers Authority (IANA) also currently has the following second level domain names reserved which can be used as examples.

example.com
example.net
example.org

132 4. IANA Considerations

IANA has agreed to the four top level domain name reservations specified in this document and will reserve them for the uses indicated.

5. Security Considerations

Confusion and conflict can be caused by the use of a current or future top level domain name in experimentation or testing, as an example in documentation, to indicate invalid names, or as a synonym for the loop back address. Test and experimental software can escape and end up being run against the global operational DNS. Even examples used "only" in documentation can end up being coded and released or cause conflicts due to later real use and the possible acquisition of intellectual property rights in such "example" names.

The reservation of several top level domain names for these purposes will minimize such confusion and conflict.

References

[RFC 1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November

[RFC 1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November

[RFC 1591] Postel, J., "Domain Name System Structure and Delegation", RFC 1591, March

133 Authors' Addresses

Donald E. Eastlake 3rd
IBM
65 Shindegan Hill Road, RR #1
Carmel, NY
Phone: (h) (w)
FAX: (3)

Aliza R. Panitz
500 Stamford Dr. No. 310
Newark, DE
USA
Phone:

134 Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
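The reservations listed in RFC 2606 lend themselves to a mechanical check, for instance to verify that names used in test fixtures or documentation cannot collide with real domains. The sketch below is only an illustration: the function name and the two constant sets are hypothetical, and the lists contain only the four top level domains and three second level domains named in the RFC text above.

```python
# Names reserved by RFC 2606 (as listed in the text above).
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SLDS = {"example.com", "example.net", "example.org"}


def is_rfc2606_reserved(name: str) -> bool:
    """Return True if name falls under a TLD or second level domain reserved by RFC 2606."""
    # DNS names are case-insensitive; a trailing dot only marks an absolute name.
    labels = name.lower().rstrip(".").split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    return ".".join(labels[-2:]) in RESERVED_SLDS


if __name__ == "__main__":
    for candidate in ("www.example.com", "host.test.", "printer.local", "example.fr"):
        print(candidate, is_rfc2606_reserved(candidate))
```

Note that a name such as printer.local is not covered by this RFC, which is why the check returns False for it.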

135 Chapter 4 Domain Name Server (DNS) 4.23 Appendix 5 Attached to the printed version of this course: an article entitled "Why the local TLD is not a good idea".

136 Why the top-level domain .local is not a good idea. Stéphane Bortzmeyer. First written on 19 July.

On many sites, internal network resources have names under the pseudo-TLD .local. This TLD (top-level domain) has not been registered for this use, and using it can bring unpleasant surprises.

It is better to use a real domain name, for example grandeentreprise.fr or petiteassociation.org. To separate out the local, purely internal resources, local.grandeentreprise.fr or monsite.petiteassociation.org (which take advantage of the hierarchical nature of the DNS) are also suitable.

In a distant era, a domain name in .com was free (yes, registering renault.com had cost me 0 €). Then obtaining a domain name became very expensive, or subject to painful bureaucratic restrictions. But the price was sharply lowered by players such as Gandi (chemla.org/textes/voleur.html), and registration rules have often been relaxed. Today it is reasonable to assume that everyone has a domain name and can use it for internal resources as well as external ones.

But why, in fact, is .local a bad idea? First, because it is in no way guaranteed. ICANN could delegate it tomorrow, and its users would then be sorely vexed. The RFC (see note 1) reserves a few TLDs for testing or documentation purposes, and .local is not among them.

But the underlying problem is that .local is not unique, since lots of companies use it. What will happen in the event of a merger or acquisition? If one user of .local absorbs another, name conflicts will be frequent. (And this has often happened.)

Note 1: To view the RFC with number NNN, for example rfc/rfc2606.txt

137 The fact that these resources are not reachable from the Internet is no reason to give them a name that is not unique.

Admittedly, US software giants such as Microsoft (with its Active Directory system) or Apple (with Bonjour) use this pseudo-TLD. But they are not examples to follow. This use merely shows their lack of interest in standardization and their tendency toward "I do what I want, and too bad for what others think". During the discussion of RFC 4795 on LLMNR, Apple even tried to obtain the reservation of .local with the sole argument "We have used it unilaterally; the IETF must now approve this choice."

138 Chapter 4 Domain Name Server (DNS) 4.24 Appendix 6 Attached to the printed version of this course: an article on Google DNS.

139 Everyone is talking about Google DNS... Stéphane Bortzmeyer. First written on 4 December. Last updated on 8 December.

So I will do the same. After Google Mail, Google Docs, Google Talk and Google Wave, Google DNS is the latest star of the blogosphere, pending Google Power (to distribute electricity) and Google Airlines (free, of course, to beat the low-cost carriers).

Google DNS is an open DNS resolver, accessible to everyone free of charge. It can be used in place of the resolvers provided by the local network's IT department or by the ISP. Instructions for this are available from Google (roughly, on Unix, it is enough to edit /etc/resolv.conf). The address to specify will certainly very soon be one of the best known on the Internet. It was a brilliant marketing idea to use an address that is easy to remember (along with its alternative), even if it is not certain that doing anycast on this range, normally allocated to Level 3, fully complies with ARIN's rules. But let us not quibble.

What is the point of using a DNS resolver other than the usual resolver found on any network? The only valid reason, in my opinion, is when that resolver is nonexistent or very slow (this happens with some ISPs). But Google puts forward other reasons. In short: speed, security and honesty.

Let us start with the last: unlike its three older competitors (the best known of which, owing to its aggressive marketing, is OpenDNS), Google's resolvers are indeed not liars. Note that we have sunk very low: not lying has become so rare that it is now cited as a selling point.

% MX doesnotexist.fr
...
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id:

140 We do get NXDOMAIN ("No Such Domain"): the DNS works normally.

And speed? A priori, since most of the response time is due to the latency to the resolver, Google DNS has little chance of being faster. At present, Google DNS apparently has no instance in France and the server seems to be in Frankfurt. But let us measure, not guess. Using qtest, let us look, for a French machine, at three resolvers on its local network, the two OpenDNS resolvers and the two Google DNS resolvers:

% qtest -n3 "A a.gtld-servers.net" / / /

The local resolvers win clearly, as expected. And if we compare the three external resolver services:

% qtest -n3 "A a.gtld-servers.net" / / /

Google beats OpenDNS, not only in honesty but also in speed. There is therefore really no practical reason to use OpenDNS. (There are other serious measurements, with the same result, for example "Google DNS vs OpenDNS: Google Rocks for International Users" (manu-j.com/blog/opendns-alternative-google-dns-rocks/403/) or "Divers Resolver DNS Services".) Another measurement tool, admittedly written by Google but whose source is available if you do not trust it, is the excellent Namebench. See also Domain Name Speed Benchmark (the latter being specific to Windows).

And security? Google promises that its resolvers implement all current best practices, which is the least one could expect. The only strategy that would have set it apart would have been to perform DNSSEC validation, but Google DNS does not. Compared with local resolvers running current, up-to-date free software such as Unbound or BIND, Google DNS has only one advantage, the use of case variation, an amusing but marginal hack.

Speaking of security, note a small problem: there is no authentication between the client on its workstation and Google DNS. Nothing guarantees that one is really talking to Google's machines. Usually, securing this last mile was not a problem because the DNS resolver was close: on the same local network, or at least at the same ISP. With Google DNS this ceases to be true, and one can imagine many possible hijackings through attacks on the routing system. These attacks could be carried out by an intermediary, or by a dishonest ISP unhappy to see its customers leave for Google. To protect against this, there are several technical solutions, but none seems realistic. The only possible DNS solutions (I exclude IPsec and the like) are TSIG (RFC; see note 1), which relies on a shared secret and is therefore unusable for a public service like Google DNS, and SIG(0) (RFC 2931), which nobody has ever deployed. In both cases, I do not believe that any existing stub resolver (for example the GNU libc) supports it, which makes them completely unrealistic and explains why Google does not offer this security service.

Note 1: To view the RFC with number NNN, for example rfc/rfc2845.txt

141 In short, there is reason to use an external resolver service only if one's own is dramatically failing. But, in that case, what are the consequences?

First, there is a problem specific to Google: the existence of a very broad offering covering almost all Internet services. However dishonest OpenDNS may be, and whatever it does with the data collected about users, the fact that it runs only a single service limits the correlations it can establish and therefore the harm it can do. By contrast, Google, with a complete offering, can establish relationships and connect data, and therefore represents a greater potential danger. Outsourcing one's mail to Gmail (or one's DNS to Google DNS) is one thing. Outsourcing all one's services is another. It amounts to having the same company be at once your bank, your doctor, your grocer and your mechanic...

How can Google profit from Google DNS, a free service? Here I am speculating; I have no precise information. Google can make money:

by exploiting the information collected to improve the search engine,

by selling this information (the most popular names, for example, information that will interest domainers, especially if the answer is NXDOMAIN, indicating that the domain is free),

by hosting, in the future (this service does not exist today) and for a fee, authoritative servers that would benefit from the proximity of the resolver for better performance. Google, future TLD host?

by reconstructing the whole of a TLD, even when the TLD does not publish this information, by counting the names in the queries and the answers obtained,

or, quite simply, by making sure that the Internet works well, so that customers can go and see Google's other services (at some ISPs the DNS resolvers work badly, which probably hampers Google in its core business).
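The "case variation" hack mentioned in the article is not described further there. A minimal sketch, assuming it refers to the technique usually called 0x20 encoding (the resolver randomizes the letter case of the query name and only accepts answers that echo exactly the same casing), could look as follows. The function names are hypothetical and no network traffic is involved.

```python
import random


def randomize_case(qname: str, rng: random.Random) -> str:
    """Return qname with each letter randomly upper- or lower-cased."""
    return "".join(
        (c.upper() if rng.random() < 0.5 else c.lower()) if c.isalpha() else c
        for c in qname
    )


def response_matches(sent_qname: str, echoed_qname: str) -> bool:
    """Accept a response only if it echoes the query name with identical casing."""
    # Deliberately case-sensitive: an off-path attacker who guesses the name
    # but not the casing produces a mismatch here.
    return sent_qname == echoed_qname


if __name__ == "__main__":
    rng = random.Random()
    sent = randomize_case("www.example.org", rng)
    print(sent, response_matches(sent, sent), response_matches(sent, sent.lower()))
```

The extra protection is limited to the number of letters in the name, which is consistent with the article's description of the hack as marginal.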

142 Chapter 4 Domain Name Server (DNS) 4.25 Appendix 7 Attached to the printed version of this course: an article on the DNS outage of ".se".

143 An entire top-level domain, the Swedish one, temporarily disappears. Stéphane Bortzmeyer. First written on 14 October.

On Monday 12 October, around 20:00 UTC, the top-level domain .se loaded a new serial number of its DNS zone, which contained a huge error. For one hour, no domain name ending in .se worked any more.

Very quickly, tweets on the subject appeared on Twitter, then reports and discussions began on operator mailing lists such as Nanog.

The immediate cause was a missing dot in the zone file: the NS records of the zone all had an extra .se at the end, for example h.ns.se.se. In the standard format of DNS zone files, as defined in section 5 of the RFC (see note 1), a name that does not end with a dot is completed with the name of the zone, here .se. During the outage, one could therefore see:

% dig +cd NS se.
...
;; ANSWER SECTION:
se IN NS h.ns.se.se.
se IN NS g.ns.se.se.

Note 1: To view the RFC with number NNN, for example rfc/rfc1035.txt

144 Did you notice? (I had not.) One .se too many at the end: the names of the name servers were therefore all considered nonexistent. .se had thus disappeared from the Internet: no more Web, no more mail, no more XMPP, etc. Since almost all interactions on the Internet begin with a DNS query, nothing worked any more.

Depending on how resolvers replaced the delegation coming from the root (which was correct) with the authoritative one (because published by the .se domain itself), they were still able to resolve names in .se or not (BIND coped better than Unbound; the reverse would have been true if the error had been at the root).

The problem did not only concern the NS records of the TLD but also those of all the delegated zones:

ballou.se NS ns1.ballou.se.se
          NS ns2.ballou.se.se
          NS ns3.aname.se.se.

Around 21:00 UTC, .se loaded a zone that fixed the error... and introduced others, notably invalid DNSSEC signatures for the SOA (unbound-users/2009-october/ html). (This problem was acknowledged by the registry.)

Everything was eventually repaired, but the bad information could still be in caches. For a while, sites in .se remained unreachable, unless you got your ISP to restart its resolver (as advised by the registry (felaktig-dns-information/); the command is rndc flush for BIND). For how long exactly? The TTL is two days, so I had thought that this would be the duration of the outage (and that is also what the registry announces), but Jay Daley rightly points out to me that, since the names do not exist, it is the negative cache (RFC 2308) that counts, and this is only two hours for .se.

This outage is one of the most serious ever to have affected a serious top-level domain. Could it have been avoided? It is obvious that validity tests (net/pipermail/dns-operations/2009-october/ html) must be run before publishing the zone. But no test detects all possible problems. For example, a verification tool shipped with BIND could have detected the problem:

% named-checkzone example example.zone
zone example/IN: NS ns1.nic.example.example has no address records (A or AAAA)

But named-checkzone also has limits. It does not set the return code in the case above, for example (and, no, -n fail changes nothing). And it does not work if the zone is updated by dynamic update (RFC 2136).

A few lessons to draw:

Problems happen, so fast detection and correction are essential.

DNSSEC, for which the Swedish registry was a pioneer, did not help. If the data is wrong, DNSSEC will not correct it. Garbage In, Garbage Out.

A few articles on the subject:

145 "Sweden's Internet broken by DNS mistake" ( 25E2%2580%2599s-internet-broken-by-dns-mistake/), a detailed analysis by Pingdom,

"I Don't Want to Say I Told You So...", the opinion of an expert,

"Crisis information in the modern society..." (information_in_the_mode.html), and of another expert,

"Tech glitch darkens Swedish websites", news from an English-language Swedish newspaper,

"Stora internetproblem f[Unicode character not shown, note 2]r Sverige" (inrikes/artikel_ svd), and news for those who speak the language of Henning Mankell and Stieg Larsson.

A few months later, the .se registry published a very detailed study in English (iis.se/docs/26875-svar-till-ptsv2-eng.pdf).

I also owe thanks to Jay Daley, David Blacka, Gilles Massen, Jakob Schlyter, Jelte Jansen and Olaf Kolkman for their analyses and for sharing information.

Note 2: Because it is too difficult to get LaTeX to display it.
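The zone-file rule behind this outage (a name without a trailing dot is completed with the name of the zone) can be illustrated with a few lines of code. This is a simplified sketch of the name-completion rule only, with a hypothetical function name; it does not parse a real zone file and ignores directives such as $ORIGIN changes.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a name read from a zone file into an absolute name.

    origin is the absolute name of the zone, e.g. "se." (trailing dot required).
    """
    if not origin.endswith("."):
        raise ValueError("origin must be an absolute name ending with a dot")
    if name == "@":
        # "@" is shorthand for the origin itself.
        return origin
    if name.endswith("."):
        # Already absolute: used as written.
        return name
    # Relative name: the origin is appended.
    return name + "." if origin == "." else name + "." + origin


if __name__ == "__main__":
    print(qualify("h.ns.se.", "se."))  # with the trailing dot, as intended
    print(qualify("h.ns.se", "se."))   # dot forgotten: the zone name is appended
```

With the dot, the name server is h.ns.se. as intended; without it, the same record designates h.ns.se.se., which is the error seen in the dig output during the outage.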

146 Chapter 4 Domain Name Server (DNS) 4.26 Appendix 8 Attached to the printed version of this course: an article on the DNS outage of ".se".

147 Swedish Post and Telecom Agency (PTS)
Internet security division
Attn: Camilla Grimelund Thomsen
Box 5398
SE Stockholm
Sweden

Stockholm, November 12, 2009

Information regarding the disruption of the domain name system under the Swedish top-level domain .se (Your reference document number )

In a statement dated October 14, 2009, PTS requested information regarding the disruption on October 12, 2009 in the domain name system under the top-level domain .se. In its statement, PTS requested that .SE submit information and account for a number of points, as presented below.

1. A full description of the circumstances regarding the disruption on October 12, 2009, which shall include the cause and scope of the disruption.

.SE's response:

In conjunction with scheduled maintenance work on the evening of Monday, October 12, 2009, a defective zone file was distributed at 9:39 p.m. The cause of the problem was a defective program update, which was not detected despite .SE's test procedures and controls. The updated software was missing the trailing dot in .se. In such cases, the BIND program automatically adds .se to all domain names. This resulted in all domain names in the .se zone being changed so as to read domainname.se.se.

As a result of a well-functioning monitoring system, .SE immediately detected the defect, troubleshooting commenced and a new file containing DNS information (a zone file) was produced and distributed within an hour. After 30 minutes, the crisis management organisation was aware of conditions and could follow the recovery work, while working in parallel on spreading information.

The cause of the incident was a number of coinciding events and circumstances, which when combined, resulted in the defective software being applied. The process leading from the development to the application of new and changed software can generally be described as follows:

148 During the development process, every altered system component is tested in a separate development environment. The tests consist of manual function tests with test cases that are adapted to the system changes that occurred and shall be implemented by two independent developers. Prior to the delivery of a release, the entire integrated production system is tested using automated function tests with predefined test cases, known as pure tests. Following an approved test, the release is delivered to the commissioning organization.

During the incident in question, the manual function tests performed on the concerned program modules for zone generation failed, and the test was only performed by one developer (not two, as stipulated by the procedures). The automatic tests do not encompass zone generation either, which rendered the defect more difficult to detect.

The commissioning organization performs routine function tests and specific tests of the new functionality commissioned as part of the acceptance tests. These are limited to that which can be performed by way of the interactive user interfaces offered by the system. The commissioner subsequently approves the delivery and grants authorization for activation.

The commissioning organization does not test the zone generation and thus did not detect the defect during this phase either.

The operating organization performs an installation in a test stage pursuant to the documentation delivered by the development division. This documentation specifies the program corrections and new functions contained in the release. The document also specifies what program packages shall be installed in what server platforms and what configuration changes shall be made. In addition, the package must include instructions regarding the preparations that shall be made (such as what services are to be turned off and the temporary unplugging of surveillance), how to perform the installation (installing the package and launching services), what tests to perform following the installation and launch, and the routines for backtracking if problems arise. In addition, the commissioning organization performs a number of basic function tests to verify that the system's components are cooperating.

The documentation that was delivered with the release in question did not contain any specific tests for the validity of the zone file, or any specific actions to stop zone distribution during the work. The documentation was also missing a description of the changes that were made to the zone generation program. The operation division's own routine tests do not encompass loading and testing the generated zone file to ensure its validity and thus the defect went undetected during this phase as well.

149 Activation in the production environment takes place on a scheduled and announced service occasion. On this occasion, one person from the development department must assist the responsible operations technician. Activation is carried out following the same documentation as for the test installation.

In this case, activation was carried out following the same documentation as in the test stage. Accordingly, no specific test of the zone file's validity was performed, nor was the scheduled zone distribution stopped. The automatic blocks that prevent unusually large changes to the zone file were deployed. Following a visual inspection of the generated data during which the missing dot was not detected, a decision was made to force the distribution of the defective zone file. Accordingly, the defective information was published via .SE's slave server operator.

The zone generator is a particularly critical component of .SE's operations, which .SE's technicians are aware of. An aggravating circumstance of the incident was that a senior system administrator had fallen acutely ill, which caused a less experienced system administrator to implement the activation of the new release. According to the applicable routines, the change should not have been implemented with only one system administrator on-site.

Another contributing factor was that the system administrator who performed the change was not familiar with the specifics of how the zone signing functioned and did not have access to the machine that conducts the zone signing. Accordingly, a decision was also made to distribute a zone file with the correct information, but with an incorrect SOA signature for .se. Given the circumstances, this was the right decision and one which enabled the contamination of the cache in the name-resolver program to be stopped.

2.
A description of the implications of the aforementioned disruption, such as its effect on the domain name system and the implications for various players, including name-server operators, domain-name holders and end users.

.SE's response: Overall, the time of day, the scarce monitoring of our system, .SE's crisis-management contingency, the speed at which corrections were made and our strong contacts with name-server operators in Sweden were to our advantage and resulted in the implications of the aforementioned incident being far less severe than they could have been.

The defective information that was distributed resulted in complications regarding accessibility to all .se domains during the period in which the defective information was published. At the same time, the information in the name-resolver services was cached on the Internet for a certain period of time, and for many end users, everything functioned as usual.

In .SE's opinion, the implications for domain-name holders and end users were mild, while efforts were required on the part of name-server operators (ISPs, registrars and web hosting providers) and probably resulted in activities in the form of troubleshooting and customer support. .SE has not received any formal complaints or damage claims.

150 The defect was successively detected and corrected during the evening, and by 1:00 a.m. on Tuesday, October 13, the .se zone was fully functioning. However, defective information remained cached in the name-resolver services, which was beyond the control of .SE.

Name-resolver services that requested .se domains during the period in which the defective zone was published (approximately 9:40 p.m. to 10:50 p.m. CEST [1]) received failure responses that remained cached for up to one day. Requests regarding the .se zone itself, such as DNS directors and SOA posts, were cached for up to two days. Residual effects from the event could also theoretically have occurred for up to 48 hours. However, according to .SE's monitoring and troubleshooting, all visible residual effects had ended as early as 24 hours later.

Slightly more than an hour after the incident, an interim zone was published. The interim zone had the correct zone information but contained an invalid DNSSEC signature. Accordingly, the contamination of the cache in the name-resolver software stopped. During the period in which the interim zone was published (approximately 10:50 p.m. to 00:55 a.m. CEST), Internet users mainly received the correct response, with the exception of cases in which the name resolver required SOA DNSSEC validation for .se. These situations led to request denials as well as implementation dependence.

3. Detailed description of the actions taken by .SE concerning the incident. This description shall include the actions taken by .SE to reduce the implications of the disruption and whether routines were in place to manage the incident and, if so, if these can be described or if a statement can be attached.

.SE's response: One of the first actions is presented in response two above, namely the distribution of an interim zone to immediately stop the contamination of the cache in the name-resolver software.
Through direct contacts with several major Swedish Internet operators, the effects of the disruption were minimized, since these operators manually purged the name-resolver services' caches as soon as the interim zone was published, thus avoiding the protracted effects that the matter could have resulted in. .SE also backtracked to a previous version of the zone generation script.

Documentation is available concerning routines for the management of backtracks, incidents and more extensive crises. Incident-management routines applicable to the event concerned are described in brief as follows:

The office and production operating environments are monitored, and any alerts can be automatically received from .SE's monitoring system or by someone reporting a defect through customer service, an emergency telephone number or e-mail. Automatic alerts are always sent by SMS to .SE's emergency telephone number. Depending on the nature of the warning, alerts can also be sent to other people in the organization through other channels, such as e-mail. During normal office hours, incidents are generally managed by technical operation personnel. After normal office hours, incidents are handled by on-call personnel. The party handling the alert makes an assessment

[1] Central European Summer Time

151 based on the type of alert, statistics and personal inspections. When in doubt, other parties are contacted for consultation. If this occurs after normal working hours, people from the on-call group are those primarily contacted. Depending on the results of the assessment, the following functions may be contacted:
- In the case of issues stemming from the distribution of zone files in which .SE's partners must be notified, the on-call personnel for .SE's slave server operations shall be contacted.
- In the case of matters that require administrative/legal counsel, the appropriate member of .SE's management shall be contacted.
- In the case of security-related matters, .SE's quality and security manager shall be contacted.

When an incident is deemed to have the potential to lead to a crisis, .SE's crisis-management plan is activated. The crisis-management team makes a preliminary assessment of the nature of the crisis. The crisis-management team subsequently follows the instructions specified in the crisis-management plan. The crisis-management team decides what functions are to actively work with the team, taking into account the current scenario, meaning what work groups shall be drafted. Those who are not affected by the crisis return to their regular tasks.

A task manager is appointed and leaves the crisis-management team to activate the crisis plan and head the operational management of the crisis. This involves assembling the resources necessary to manage the crisis, informing them of the situation and performing an analysis of the incident according to a checklist. The task manager reports the results of the analysis to the crisis-management team.

During the incident on October 12, the technical personnel in question were already on-site due to scheduled maintenance work, and the cause of the event was thus promptly identified and a correction was initiated essentially immediately.
Accordingly, .SE categorizes the event as a serious incident rather than a crisis situation.

4. A description of the information regarding the disruption submitted by .SE to the concerned players (such as name-server operators, domain-name holders and end users) and when and how this took place.

.SE's response:

October 12
In conjunction with the activation of the crisis plan at 11:06 p.m., a number of activities were initiated, including the dissemination of information.
At 11:06 p.m., the security manager notifies the information manager and customer service manager of the events and tells them to prepare for contact with the press and customers, respectively. Ongoing contact with the information manager is maintained until 1:00 a.m.
At 11:10 p.m., .SE's CEO contacts Aftonbladet newspaper; at 11:16 p.m., the TT news service is contacted; and at 11:45 p.m., Expressen is notified. The reasoning behind this is that the information will reach domain-name holders and end users quicker through the media than if .SE uses resources posting the information on its website.

152 At 11:27 p.m., the security manager notifies the crisis-management team of the status of the events by e-mail.
At 11:33 p.m., the CEO notifies the Board of Directors of the status of the incident by e-mail.
At 12:12 a.m., the security manager notifies .SE's DNS reference group, which included most major Swedish name-server operators, of the status of the incident.
At 1:05 a.m., the security manager advises the crisis-management team that the problem has been resolved and announces a return to standard operations.

October 13
At 7:02 a.m., internal information is distributed to notify customer service staff and other personnel. All press contact is referred to the information manager or CEO.
At 7:11 a.m., brief information is provided to the Swedish Post and Telecom Agency, the supervising authority.
At 8:24 a.m., information is sent to the DNS reference group list with advice on how to purge the resolvers.
At 8:30 a.m., a scheduled status meeting is held with the concerned members of the crisis-management team to obtain as much information as possible, and an internal investigation headed by the security manager is initiated.
At 9:31 a.m., the information is compiled in a document that is distributed internally and posted on .SE's website.
At 9:30 a.m., the information is sent to the SOF group by e-mail.
At 9:48 a.m., brief information is sent to .SE's registrars in Swedish.
At 10:17 a.m., brief information is sent to .SE's registrars in English.
At 11:19 a.m., supplementary information is sent to PTS.

October 14
Detailed information regarding the incident is posted on .SE's website and sent to .SE's registrars and to the DNS reference group.

October 15
Detailed information regarding the incident is sent to CENTR Full Members. PTS is updated regarding the information that was sent to registrars and DNS operators.

153 5. A description of future measures that .SE plans to implement to avoid similar disruptions from occurring.

.SE's response: The most urgent action taken was naturally to investigate the incident on October 12. An internal investigation commenced immediately on the morning of October 13. Two separate external investigations were subsequently initiated: one technically oriented investigation and one more focused on organization, responsibilities and routines. The IT operations group was reinforced with a temporary senior operations technician.

One thing we established early on is that we are lacking channels for reaching ISPs/web hosting providers and resolver operators located outside of Sweden. Therefore we have started a global improvement initiative aimed at finding ways to make this kind of contact information available for all the large operators around the world, if need be. We have started discussions with various parties on this issue.

Furthermore, .SE has distributed a generic request to all concerned players for suggestions for internal and external improvement measures. Submitted suggestions are being compiled, analyzed and prioritized. We are working on the formulation of an action plan based on these suggested measures and those improvement proposals that have surfaced in both the internal and external inquiries. Our routines have also been tightened up.

This work is being coordinated by .SE's security manager. A steering group has been appointed to make decisions regarding prioritizations and establishing who is responsible for implementing actions. Specific resources have been allocated for the additional expenses caused by the incident. The incident and the management thereof will continue to be discussed by .SE's Board of Directors at a meeting on November 23.

We welcome a meeting with PTS representatives if so desired, for a more comprehensive review of the actions taken and those that will be taken.

Danny Aerts, CEO

154 Chapter 4 Domain Name Server (DNS)
4.27 Appendix 9
Attached to the printed version of this course: an article on the DNS outage of ".cn".

155 The great Chinese DNS outage of May 2009
Stéphane Bortzmeyer
First version of this article written on 6 November

On 19 May 2009, China suffered its largest Internet outage. At the time, many articles were published, mostly without practical details, apart from the fact that the problem was reportedly DNS-related. On 5 November, at the OARC meeting in Beijing, Ziqian Liu, of China Telecom, gave a remarkable talk ( workshop /ziqian_liu.pdf) detailing the causes of the outage. It is a fine exercise in transparency, with plenty of technical detail. I am not sure French Internet operators would do as much if such an outage hit France.

So, on 19 May at about 9 p.m. local time, the telephones start ringing: the Internet is down. China Telecom, and other ISPs as well, observe both that users are complaining and that traffic has dropped considerably. The pipes are therefore not overloaded, quite the opposite. Moreover, service is not completely interrupted: sometimes it still works a little. Because of the drop in traffic, a routing problem is suspected. It takes some time before someone notices these lines in the log of the ISP's recursive name servers:

    19-May :21: client: warning: client #51939: recursive-clients soft limit exceeded, ab
    19-May :21: client: warning: client #1151: recursive-clients soft limit exceeded, abor

And the truth emerges: the problem lies in the recursive DNS service. Most DNS queries fail. A few get through, which explains why the Internet does not seem completely down to some users. Since almost every Internet activity depends on the DNS, network traffic drops.

To summarize the outage: there was indeed a DoS attack but, as in billiards, the attack did not hit the DNS servers directly.
The attack affected a very popular service, Baofeng, which distributes video and music. The attackers hit an online game server and bring it down. Nothing extraordinary: this kind of thing happens every day on the Internet. Except that the attack also stops some of the servers of the baofeng.com domain, which share the same infrastructure. And the Baofeng client software, faced with the failure, reacts by sending even more DNS queries, which remain unanswered.

The resolver software used by all Chinese ISPs, BIND, has a limit on the number of pending recursive DNS queries, 1000 by default. In normal times that is more than enough but, here, it was quickly filled by the countless queries waiting for the baofeng.com servers. If you run a BIND recursive server, you can see the state of the queries in progress with rndc:

156
    % rndc status
    ...
    recursive clients: 13/1900/2000

The last figure is the hard limit on the number of pending queries (it is set with the recursive-clients option in named.conf). The second-to-last is the soft limit, beyond which BIND starts dropping queries, producing the message shown above. The first figure is the number of queries currently pending. Because of the number of clients waiting for baofeng.com, this limit was quickly exceeded, suppressing all DNS resolution, even for domains that have nothing to do with Baofeng.

Do the calculation for your own recursive server: take the difference between the soft limit and the number of clients in normal times (here, say 1900) and divide it by the query rate: that gives you an idea of the number of seconds you can hold out in the event of a failure. Here, if the query rate is 100 per second (a value for a small ISP), you only get nineteen seconds of margin if a large, very popular domain fails... Most recursive servers probably have a recursive-clients value that is too low.

Conclusion: if someone manages to crash all the DNS servers of google.com or ebay.com, they can in theory bring down the whole DNS and therefore the whole Internet.

In the Chinese case, all the resolvers were BIND (as is probably the case in most countries). It was not possible to test with other resolvers such as Unbound, but nothing indicates that they would have done better. The choice of the BIND developers was to use a table of limited size for pending queries. Had that table been dynamic instead, the recursive server would have swallowed all the server's memory.
Some of the least badly informed articles published about this outage:
- DNS Attack Downs Internet in Parts of China ( article/165319/dns_attack_downs_interne)
- A month after web chaos, Baofeng issues new media player ( content/month-after-web-chaos-baofeng-issues-new-media-player)
- Internet attack organized says Ministry ( 2009/200905/ /article_ htm)

Thanks to Ziqian Liu for his detailed explanations.
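The margin calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sample line is the one shown in the article, and the rate of 100 queries per second is the article's own example figure.

```python
import re

def parse_recursive_clients(status_text):
    """Return (current, soft_limit, hard_limit) from `rndc status` output."""
    match = re.search(r"recursive clients:\s*(\d+)/(\d+)/(\d+)", status_text)
    if match is None:
        raise ValueError("no 'recursive clients' line found")
    current, soft, hard = (int(group) for group in match.groups())
    return current, soft, hard

def seconds_of_margin(current, soft_limit, queries_per_second):
    """Seconds before the soft limit is reached if no pending query completes."""
    return (soft_limit - current) / queries_per_second

# Sample line taken from the article; on a real server, feed in the
# actual output of `rndc status`.
sample = "recursive clients: 13/1900/2000"
current, soft, hard = parse_recursive_clients(sample)
margin = seconds_of_margin(current, soft, queries_per_second=100)
print(f"about {margin:.0f} seconds of margin")
```

The sketch assumes the worst case, in which every new query stays pending, which is what happened when the baofeng.com servers stopped answering.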

157 Chapter 5 SSH
Why this chapter: everything a system administrator ever wanted to know about SSH. There is certainly a lot of overlap with what will have been covered in the NETWORK course.

5.1 Introduction
See the course by Frédérique BONGAT.

158 Chapter 6 Network service managers: inetd, tcpd, xinetd
Why this chapter

6.1 Reminder on IP connections
Reminder: an IP connection is made up of several elements:
- a source IP address
- a source port number on the originating machine
- a destination IP address
- a port number on the destination machine
- the protocol, TCP or UDP

Warning:
- On UNIX, the C function that obtains a source port < 1024 works only for UID 0, hence the use of SetUID 0.
- On Windows, the C function that obtains a source port < 1024 works whatever the UID.
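As an illustration of the five elements listed above, the following Python sketch opens a TCP connection over the loopback interface and prints the tuple as seen from each end. The helper name connection_tuple is hypothetical, and the port is chosen by the kernel, so the actual numbers vary from run to run.

```python
import socket

def connection_tuple(sock, protocol="tcp"):
    """Return (src_ip, src_port, dst_ip, dst_port, protocol) for a connected socket."""
    src_ip, src_port = sock.getsockname()[:2]
    dst_ip, dst_port = sock.getpeername()[:2]
    return (src_ip, src_port, dst_ip, dst_port, protocol)

# Listen on an ephemeral loopback port (port 0 lets the kernel pick one).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

client_view = connection_tuple(client)
server_view = connection_tuple(server_side)
print("client view:", client_view)
print("server view:", server_view)

for s in (client, server_side, listener):
    s.close()
```

The two views describe the same connection with source and destination swapped.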

159 6.2 The /etc/services file
The "/etc/services" file lists triplets (port number, protocol, service name). See the previous chapters.

6.3 Network service managers
A UNIX machine offers many services reachable over the network. The network services contacted often run only for a short time:
- there is no point in keeping them running permanently (resource consumption)
- they are started only in response to a request

A "service manager" is therefore used to listen for requests addressed to certain network services and to launch those services. More specifically, the service manager takes care of:
- waiting for network connections on certain ports
- launching the services contacted
- the behaviour appropriate to launching those services
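The triplets of "/etc/services" mentioned in section 6.2 can be extracted with a short Python sketch. The function name parse_services and the sample text are illustrative assumptions, not an excerpt of a real file. Aliases and comments are ignored.

```python
def parse_services(text):
    """Return a list of (port, protocol, service_name) triplets."""
    triplets = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        triplets.append((int(port), proto, name))
    return triplets

# Illustrative sample in the usual /etc/services layout.
SAMPLE = """\
# service  port/protocol  aliases
ftp        21/tcp
telnet     23/tcp
domain     53/udp         nameserver   # DNS
"""

triplets = parse_services(SAMPLE)
for port, proto, name in triplets:
    print(port, proto, name)
```

On a real system, the standard library functions socket.getservbyname() and socket.getservbyport() perform the same lookup against the system database.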

160 6.3 Network service managers (continued)
There are several service managers:
- the INETD program
- the XINETD program

Another program, TCPD, is not a service manager: it is a service controller.

6.4 Network service manager: INETD
[Figure, step 1: host1.example.com (client), host2.example.com (inetd), network.]

161 6.4 Network service manager: INETD (continued)
[Figure, step 2: host1.example.com (client), host2.example.com (inetd, fork(), exec(), server), network.]
[Figure, step 3: host1.example.com (client), host2.example.com (inetd, server), network.]
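The three steps above (wait for a connection, fork() and exec() the server, let the client talk to the server) can be sketched in Python. This is a minimal illustration, not inetd's actual implementation. It handles a single connection, uses subprocess.Popen (which forks and execs on POSIX systems) in place of explicit fork()/exec() calls, and the "service" is a hypothetical one-liner that upper-cases its input.

```python
import socket
import subprocess
import sys

def serve_once(listener, argv):
    """Accept one connection and run argv with the socket as stdin/stdout."""
    conn, _peer = listener.accept()
    # Popen forks and execs argv; the child gets the socket as fd 0 and fd 1.
    child = subprocess.Popen(argv, stdin=conn.fileno(), stdout=conn.fileno())
    conn.close()  # the parent no longer needs its copy of the socket
    return child.wait()

# Hypothetical service: upper-cases whatever it reads on standard input.
SERVICE = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# Step 1: the manager listens on behalf of the service.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
client.sendall(b"hello\n")
client.shutdown(socket.SHUT_WR)  # tell the service the request is complete

# Steps 2 and 3: the service is launched and talks directly to the client.
exit_status = serve_once(listener, SERVICE)
reply = client.recv(1024)
print(reply)

client.close()
listener.close()
```

The service itself contains no network code: it only reads standard input and writes standard output, which is why simple daemons can be started this way.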

162 6.5 INETD: the /etc/inetd.conf configuration file
The "/etc/inetd.conf" file is specific to the "inetd" service manager. Its format is as follows:

    # Syntax for socket-based Internet services:
    # service_name socket_type proto flags user server_pathname args

For example:

    ...
    ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd
    telnet stream tcp nowait root /usr/sbin/in.telnetd in.telnetd
    ...

The fields have the following meaning:
- field 1, "service_name": the symbolic name of a service (see "/etc/services").
- field 2, "socket_type": the type of the network socket, essentially "stream" or "dgram".
- field 3, "proto": the network protocol used: "tcp", "udp".
- field 4, "flags": indicates whether another request of the same type may be answered while the first one has not finished. Possible values: "wait", "nowait".
- field 5, "user": the name of the user under which the program will run.
- field 6, "server_pathname": the absolute path of the program to execute.
- field 7, "args": the arguments passed to "execv()", starting with argv[0].
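The seven fields can be made concrete with a small Python sketch that splits one configuration line. The names InetdEntry and parse_inetd_line are hypothetical. Comment lines and the extended syntaxes of some inetd variants are not handled.

```python
from typing import List, NamedTuple

class InetdEntry(NamedTuple):
    service_name: str
    socket_type: str
    proto: str
    flags: str
    user: str
    server_pathname: str
    args: List[str]  # argv for the server, argv[0] first

def parse_inetd_line(line):
    """Split one non-comment inetd.conf line into its fields."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields")
    return InetdEntry(*fields[:6], args=fields[6:])

entry = parse_inetd_line(
    "ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l")
print(entry.server_pathname, entry.args)
```

The program to execute is entry.server_pathname, and entry.args is the argument vector it receives.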

163 6.5 INETD: the /etc/inetd.conf configuration file (continued)
For example:

    ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l

gives

    char *argv[] = { "in.ftpd", "-l", 0 };
    execv("/usr/sbin/in.ftpd", argv);

6.6 INETD: reconfiguration via SIGHUP
Inetd is started by the boot scripts. If the "/etc/inetd.conf" file is modified, the change is taken into account by sending the "SIGHUP" signal to the inetd process:

    # ps -ax | grep inetd
    173 ? IW 2 0:00 inetd
    p2 S 2 0:00 grep inetd
    # kill -HUP 173

164 6.7 INETD: problems
INETD works very well in practice. It has one major flaw: it handles no security aspect at all. One would at least like to be able to:
- filter access to certain services
- keep traces of the activation of certain services

A remedy: use the TCP WRAPPERS software.

6.8 Network service manager: TCP WRAPPERS
The idea behind TCP WRAPPERS: insert an extra link into the INETD chain.
[Figure: host1.example.com (client), host2.example.com (inetd), network.]

165 6.8 Network service manager: TCP WRAPPERS (continued)
[Figure: host1.example.com (client), host2.example.com (inetd, fork(), exec(), tcpd), network.]
[Figure: host1.example.com (client), host2.example.com (inetd, tcpd), network.]

166 6.8 Network service manager: TCP WRAPPERS (continued)
[Figure: host1.example.com (client), host2.example.com (inetd, tcpd, fork(), exec(), server), network.]
[Figure: host1.example.com (client), host2.example.com (inetd, tcpd, server), network.]

167 6.8 Network service manager: TCP WRAPPERS (continued)
[Figure: host1.example.com (client), host2.example.com (inetd, server), network.]

The TCP WRAPPERS library is now built into many products (even on WINDOWS...).

168 6.9 TCP WRAPPERS: changes to /etc/inetd.conf
Inserting tcpd into the INETD chain requires modifying the "/etc/inetd.conf" file. For example, one goes from:

    ftp stream tcp nowait root /usr/etc/in.ftpd in.ftpd -l
    telnet stream tcp nowait root /usr/etc/in.telnetd in.telnetd
    shell stream tcp nowait root /usr/etc/in.rshd in.rshd
    login stream tcp nowait root /usr/etc/in.rlogind in.rlogind

to:

    ftp stream tcp nowait root /chemin/vers/tcpd in.ftpd -l
    telnet stream tcp nowait root /chemin/vers/tcpd in.telnetd
    shell stream tcp nowait root /chemin/vers/tcpd in.rshd
    login stream tcp nowait root /chemin/vers/tcpd in.rlogind

where "/chemin/vers/tcpd" stands for the path to tcpd.

Reminder on how a line of "inetd.conf" is interpreted. For example:

    ftp stream tcp nowait root /chemin/vers/tcpd in.ftpd -l

gives

    char *argv[] = { "in.ftpd", "-l", 0 };
    execv("/chemin/vers/tcpd", argv);
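The rewriting shown above is mechanical: only field 6 (server_pathname) changes, while the arguments, and in particular argv[0], stay the same. A minimal Python sketch, in which the function name wrap_with_tcpd is hypothetical and "/chemin/vers/tcpd" is the slide's placeholder path:

```python
def wrap_with_tcpd(line, tcpd_path="/chemin/vers/tcpd"):
    """Replace the server_pathname field of an inetd.conf line with tcpd."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected 7 fields or more")
    fields[5] = tcpd_path  # field 6: server_pathname; args are untouched
    return " ".join(fields)

ORIGINAL = [
    "ftp stream tcp nowait root /usr/etc/in.ftpd in.ftpd -l",
    "telnet stream tcp nowait root /usr/etc/in.telnetd in.telnetd",
]
wrapped = [wrap_with_tcpd(line) for line in ORIGINAL]
for line in wrapped:
    print(line)
```

Because argv[0] is preserved, tcpd still receives the name of the daemon it is standing in for.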

169 6.10 TCP WRAPPERS: access control, /etc/hosts.allow, /etc/hosts.deny
TCP WRAPPERS access control relies on the files "/etc/hosts.allow" and "/etc/hosts.deny":
1 First check whether the TCP request is allowed by the contents of "/etc/hosts.allow". If so, OK. If not, go to step 2.
2 Check whether the TCP request is forbidden by the contents of "/etc/hosts.deny". If so, the request is rejected. If not, go to step 3.
3 The TCP request is accepted.

In the "hosts.allow" and "hosts.deny" files, filtering rules can be applied per service offered. Here is an example of a connection policy:

In the "/etc/hosts.allow" file:

    in.telnetd: .fr, [...]
    in.rlogind: .fr, [...]
    in.rshd: .fr, [...]

In the "/etc/hosts.deny" file:

    in.telnetd: ALL
    in.rlogind: ALL
    in.rshd: ALL

In plain language, these files state that telnet, rlogin and rsh connections are allowed only from machines in the French domain ".fr" and from a domain in Tunisia (network address [...]).
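The three-step decision described above can be sketched in Python. This is a simplified model, not tcpd's actual matcher. It understands only the "ALL" keyword, leading-dot domains and exact host names, and the function names and host names are hypothetical. The Tunisian network address lost from the slide is deliberately left out.

```python
def matches(pattern, host):
    """Tiny subset of the pattern syntax: ALL, a leading-dot domain, or a name."""
    if pattern == "ALL":
        return True
    if pattern.startswith("."):
        return host.endswith(pattern)
    return host == pattern

def rule_applies(rules, service, host):
    return any(matches(pattern, host) for pattern in rules.get(service, []))

def access_granted(service, host, allow, deny):
    if rule_applies(allow, service, host):  # step 1: hosts.allow
        return True
    if rule_applies(deny, service, host):   # step 2: hosts.deny
        return False
    return True                             # step 3: accept by default

ALLOW = {"in.telnetd": [".fr"], "in.rlogind": [".fr"], "in.rshd": [".fr"]}
DENY = {"in.telnetd": ["ALL"], "in.rlogind": ["ALL"], "in.rshd": ["ALL"]}

print(access_granted("in.telnetd", "pc.example.fr", ALLOW, DENY))
print(access_granted("in.telnetd", "pc.example.com", ALLOW, DENY))
print(access_granted("in.ftpd", "pc.example.com", ALLOW, DENY))
```

The last call shows the consequence of step 3: a service that appears in neither file is open to everyone.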

170 6.10 TCP WRAPPERS: access control, /etc/hosts.allow, /etc/hosts.deny (continued)
A second syntax exists for TCPD. Use this syntax!

Syntax:
    service : machine-designation : allow
    service : machine-designation : deny

Machines can be designated in several ways:
- machine names ("serveur.example.com")
- domains of machine names (".example.com")
- CIDR network addresses (" / ")
- network addresses (" ")
- the special name "ALL"

Special service name: "ALL"
Allow everything: "ALL : ALL : allow"
Forbid everything: "ALL : ALL : deny"

WARNING: when a machine is forbidden, it must be forbidden both by its FQDN and by its IP address (so that the machine is still blocked should a name server be unreachable).
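The second syntax can be modelled in Python as an ordered list of rules in which the first matching rule decides. This is a sketch under assumptions: first-match evaluation and the "allow when nothing matches" default are how the sketch behaves, and should be checked against the local hosts_access(5) and hosts_options(5) manual pages. All names and addresses below are hypothetical, and the addresses come from documentation ranges.

```python
import ipaddress

def host_matches(designation, name, address):
    """Match a designation against a client's name and IP address."""
    if designation == "ALL":
        return True
    if designation.startswith("."):
        return name.endswith(designation)
    if "/" in designation:  # CIDR network, e.g. 192.0.2.0/24
        network = ipaddress.ip_network(designation)
        return ipaddress.ip_address(address) in network
    return designation in (name, address)

def decide(rules, service, name, address, default=True):
    """Return True (allow) or False (deny) using the first matching rule."""
    for rule_service, designation, verdict in rules:
        if rule_service in ("ALL", service) and host_matches(
                designation, name, address):
            return verdict == "allow"
    return default

RULES = [
    ("in.telnetd", ".example.com", "allow"),
    ("in.telnetd", "192.0.2.0/24", "allow"),
    ("ALL", "ALL", "deny"),
]

print(decide(RULES, "in.telnetd", "pc.example.com", "198.51.100.7"))
print(decide(RULES, "in.telnetd", "pc.example.org", "192.0.2.10"))
print(decide(RULES, "in.ftpd", "pc.example.com", "192.0.2.10"))
```

The second call illustrates the warning above: a rule written with an address still applies when the client's name is not the expected one or cannot be resolved.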

171 6.10 TCP WRAPPERS: access control, /etc/hosts.allow, /etc/hosts.deny (continued)
WARNING: enabling tcpd does not mean that every security problem is solved and that one can relax!
- Do not forget that many other services do not go through inetd.conf.
- The operating traces reported by tcpd must be monitored.
- The /etc/hosts.allow and /etc/hosts.deny files must be filled in.
- The daemons themselves must be free of security holes.

6.11 TCP WRAPPERS: programming with libwrap.a
TCP WRAPPERS also provide a programming library: "libwrap.a" and "tcpd.h".
They are usually installed as "/usr/local/lib/libwrap.a" and "/usr/local/include/tcpd.h".
The library provides the C function "hosts_access()".
Link against this library when needed.

172 6.12 Network service manager: XINETD

(extended inetd)

Cf « »
The version numbering is somewhat complicated...

XINETD is a complete rewrite of INETD that adds several features missing from INETD:
  - access control in the style of TCP WRAPPERS
  - time-based access to daemons
  - syslog logging of connections (failed and successful)
  - a limit on the number of instances of each daemon
  - binding to specific network addresses

6.13 XINETD: configuration file /etc/xinetd.conf

XINETD uses the file «/etc/xinetd.conf», which is incompatible with «/etc/inetd.conf».

File format:

  defaults
  {
      attribute operator value(s)
      ...
  }

  service toto
  {
      attribute operator value(s)
      ...
  }

The operators are «=», «+=» and «-=». See the documentation for the complete list of attributes.
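As a rough illustration of the three operators, the Python sketch below applies «attribute operator value(s)» lines to a dictionary. It is a simplification written for this text, not xinetd's parser: the function name `apply_attributes` is hypothetical and every value is treated as a list of words.

```python
def apply_attributes(lines):
    """Apply 'attribute operator value(s)' lines to a dict of value lists."""
    settings = {}
    for line in lines:
        name, operator, *values = line.split()
        current = settings.setdefault(name, [])
        if operator == "=":        # assign: replace any previous value
            settings[name] = list(values)
        elif operator == "+=":     # add values to the existing list
            current.extend(values)
        elif operator == "-=":     # remove values from the existing list
            settings[name] = [v for v in current if v not in values]
        else:
            raise ValueError(f"unknown operator: {operator}")
    return settings

example = apply_attributes([
    "disabled = shell login exec",
    "disabled += time daytime",
    "disabled -= login",
])
print(example["disabled"])
```

An attribute written with «=» and no value simply ends up with an empty list in this model.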

173 6.13 XINETD: configuration file /etc/xinetd.conf

Note that some versions of «xinetd» support a directory «/etc/xinetd.d» holding one file per service, named after the service and containing its settings. For example «/etc/xinetd.d/ftpd».

6.14 XINETD: default settings

For example:

  defaults
  {
      instances      = 15
      log_type       = FILE /var/log/servicelog
      log_on_success = HOST PID USERID DURATION EXIT
      log_on_failure = HOST USERID RECORD
      only_from      =
      disabled       = shell login exec comsat telnet ftp tftp finger
      disabled      += time daytime chargen servers services xadmin
  }

Here the line «only_from =» denies every host access to the services by default. Whatever is needed is then allowed in the configuration block of each service.

174 6.15 XINETD: configuring a service

Example for the «ftp» service (see «/etc/services»):

  service ftp
  {
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/in.ftpd
      server_args  = -l
      instances    = 4
      access_times = 7:00-12:30 13:30-21:00
      nice         = 10
      only_from    = /24
  }

6.16 XINETD: /etc/xinetd.conf: nice directive

For example, the «ftp» block above sets the scheduling priority of the daemon:

      nice         = 10

175 6.17 XINETD: /etc/xinetd.conf: access_times directive

For example, the «ftp» service block of section 6.15 restricts the hours during which the service is available:

      access_times = 7:00-12:30 13:30-21:00

6.18 XINETD: /etc/xinetd.conf: bind and id directives

Consider a host with 2 network interfaces whose addresses are « » and « »:

  service ftp
  {
      id           = ftp-public
      bind         =
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/in.ftpd
      server_args  = -l
      instances    = 4
      access_times = 7:00-21:00
      nice         = 10
      only_from    = /24
  }

  service ftp
  {
      id           = ftp-private
      bind         =
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/in.ftpd
      server_args  = -l
      instances    = 4
      access_times = 7:00-21:00
      nice         = 10
      only_from    = /24
  }

The «id» directive is used in SYSLOG to tell apart the logs of the two FTPD instances.
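The «access_times» values used in these examples are lists of hour:minute intervals. The Python sketch below shows one way to read such a value; it is an illustration only, the function names are hypothetical, and treating both ends of an interval as inclusive is an assumption rather than documented xinetd behaviour.

```python
from datetime import time

def parse_access_times(value):
    """Turn '7:00-12:30 13:30-21:00' into a list of (start, end) time pairs."""
    intervals = []
    for chunk in value.split():
        start, end = chunk.split("-")
        h1, m1 = map(int, start.split(":"))
        h2, m2 = map(int, end.split(":"))
        intervals.append((time(h1, m1), time(h2, m2)))
    return intervals

def is_allowed(value, now):
    """True if 'now' falls inside one of the intervals (ends assumed inclusive)."""
    return any(start <= now <= end for start, end in parse_access_times(value))

FTP_HOURS = "7:00-12:30 13:30-21:00"
print(is_allowed(FTP_HOURS, time(9, 0)))    # morning slot
print(is_allowed(FTP_HOURS, time(13, 0)))   # lunch break, outside both slots
```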

176 6.19 XINETD: /etc/xinetd.conf: redirect directive

For example:

  service telnet
  {
      flags        = REUSE
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/in.telnetd
      only_from    = /24
      redirect     =
  }

A «telnet» to « » is then forwarded to the «telnetd» daemon of « ».

6.20 XINETD: /etc/xinetd.conf: tcpd and the NAMEINARGS flag

For example:

  service ftp
  {
      flags        = NAMEINARGS REUSE
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/tcpd
      server_args  = /usr/sbin/in.ftpd -l
      instances    = 4
      access_times = 7:00-12:30 13:30-21:00
      nice         = 10
      only_from    = /24
  }

TCP WRAPPERS and XINETD are not mutually exclusive.

177 6.21 XINETD: /etc/xinetd.conf: chroot

For example:

  service ftp
  {
      socket_type  = stream
      wait         = no
      user         = root
      server       = /usr/sbin/chroot
      server_args  = /quelquepart/blockhaus/ftp /usr/sbin/in.ftpd -l
      instances    = 4
      access_times = 7:00-12:30 13:30-21:00
      nice         = 10
      only_from    = /24
  }

The daemon is thus confined to the «/quelquepart/blockhaus/ftp» tree, which it cannot leave. This benefits the security of the host.

6.22 XINETD: reconfiguration, signals

The following signals are supported:
  - «SIGHUP»: on reception, «xinetd» reloads its configuration
  - «SIGTERM»: on reception, «xinetd» terminates
  - «SIGUSR1»: on reception, «xinetd» writes the file «/var/run/xinetd.dump»

178 6.23 Useless internal services

Reminder: INETD and XINETD start network services on demand.

Example with INETD, excerpt from «/etc/inetd.conf»:

  ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l

which results in

  execl("/usr/sbin/in.ftpd", "in.ftpd", "-l", (char *)0);

Here the service is provided by an «external» executable. But a service can also be internal, that is, provided directly by INETD or XINETD.

Fundamental rule: all internal services are useless and must be disabled.

For example, for INETD, disable every service of type internal in «/etc/inetd.conf»:

  #daytime stream tcp nowait root internal
  #daytime dgram  udp wait   root internal
  #time    stream tcp nowait root internal
  #time    dgram  udp wait   root internal
  #echo    stream tcp nowait root internal
  #echo    dgram  udp wait   root internal
  #discard stream tcp nowait root internal
  #discard dgram  udp wait   root internal
  #chargen stream tcp nowait root internal
  #chargen dgram  udp wait   root internal
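Disabling the internal services amounts to commenting out every active entry whose server field is «internal». A minimal Python sketch of that edit follows; the function name `disable_internal` is hypothetical, and the sketch works on a string rather than on the real «/etc/inetd.conf», so it can be tried safely.

```python
def disable_internal(conf_text):
    """Comment out every active inetd.conf entry whose server is 'internal'."""
    result = []
    for line in conf_text.splitlines():
        fields = line.split()
        # Fields: service, socket type, protocol, wait flag, user, server, args...
        if not line.startswith("#") and len(fields) >= 6 and fields[5] == "internal":
            line = "#" + line
        result.append(line)
    return "\n".join(result)

SAMPLE = """ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd -l
daytime stream tcp nowait root internal
#time dgram udp wait root internal
echo dgram udp wait root internal"""

print(disable_internal(SAMPLE))
```

After editing the real file, inetd still has to be told to reread its configuration.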

179 6.24 R-services

Several services started by INETD and XINETD are known as R-services (remote services):
  - RLOGIND service: interactive login on a remote system
  - RSHD service: running a command on a remote system
  - RCPD service: copying local files to remote files or vice versa

6.25 R-service rlogind, R-command rlogin

(remote login daemon, remote login)

In general, the path of the R-service is «/usr/sbin/in.rlogind» or «/usr/sbin/rlogind».

Syntax of the R-command: rlogin [-l user] hostname

  % rlogin -l besancon serveur.example.com
  Password: XXXXXXXX
  Last login: Mon Sep 15 09:42:58 from
  Sun Microsystems Inc. SunOS 5.5 Generic November 1995
  server%

Whenever possible, prefer an SSH connection (here with «ssh»), because «rlogin» sends the password in clear text over the network.

180 6.25 R-service rlogind, R-command rlogin

Access control (1)

Two files control access:
  - the system file «/etc/hosts.equiv»
  - the personal file «$HOME/.rhosts»

Syntax common to both files:
  - format 1: «hostname»
  - format 2: «hostname username»

Two cases arise when a connection is made:
  - a line in either format authorizes the connection, which then proceeds without asking for a password
  - no line authorizes the connection, and the password of the local user is then requested

A «+» can be used in place of either field:
  - in the host field, it means «any host»
  - in the user field, it means «any user»

See the C function «ruserok()».

181 6.25 R-service rlogind, R-command rlogin

Access control (2): format «hostname»

Meaning: users of the host «hostname» are allowed to log in to the system under the same name without being asked for a password.

This format can be used in «/etc/hosts.equiv» and in the personal files «$HOME/.rhosts».

For example, if the file «/etc/hosts.equiv» contains:

  cerise.example.com

then a user «martin» can log in with RLOGIN from the host «cerise.example.com» under the local identity «martin» without being asked for a password.

For example, if the user «jean» has the following «$HOME/.rhosts» file:

  cerise.example.com

then the user «jean» can log in with RLOGIN from the host «cerise.example.com» under the local identity «jean» without being asked for a password.

182 6.25 R-service rlogind, R-command rlogin

Access control (3): format «hostname username»

Meaning: the user «username» of the host «hostname» is allowed to log in to the system without being asked for a password.

If this format is used in «$HOME/.rhosts», it allows a remote user to log in under the name of one local user (the owner of the file).
If this format is used in «/etc/hosts.equiv», it allows a remote user to log in under the name of any local user.

For example, if the user «jean» has the following «$HOME/.rhosts» file:

  cerise.example.com martin

then the user «martin» can log in with RLOGIN from the host «cerise.example.com» under the local identity «jean» without being asked for a password.

For example, if the file «/etc/hosts.equiv» contains:

  cerise.example.com jean

then the user «jean» can log in with RLOGIN from the host «cerise.example.com» under any local identity without being asked for a password. DANGEROUS.
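The two line formats and the «+» wildcard can be summed up in a small Python sketch. It is a deliberately simplified model, not the logic of «ruserok()» (which handles more cases, and is not reproduced here); the function name `entry_allows` is hypothetical. The caller is assumed to pass a line taken either from «/etc/hosts.equiv» or from the «.rhosts» file of `local_user`.

```python
def entry_allows(entry, remote_host, remote_user, local_user):
    """Return True if one 'hostname [username]' line authorizes the login."""
    fields = entry.split()
    host = fields[0]
    # Format 1 ("hostname"): the remote user must have the same name as the
    # local account. Format 2 ("hostname username"): the named remote user.
    user = fields[1] if len(fields) > 1 else local_user
    host_ok = host in ("+", remote_host)   # "+" means any host
    user_ok = user in ("+", remote_user)   # "+" means any user
    return host_ok and user_ok

# Line "cerise.example.com martin" in jean's .rhosts: martin may log in as jean.
print(entry_allows("cerise.example.com martin", "cerise.example.com", "martin", "jean"))
# Line "+ +": anybody from anywhere, which is why it is so dangerous.
print(entry_allows("+ +", "evil.example.org", "anyone", "root"))
```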

183 6.25 R-service rlogind, R-command rlogin

Problems:
  - Watch the contents of «/etc/hosts.equiv». SunOS 4.x.y shipped a file containing «+ +».
  - Watch root logins over the network. Root logins over the network must be forbidden because they are anonymous. Check «/etc/ttys» (or its equivalent «/etc/securetty»...) and keep the number of secure terminals to a minimum.
  - Check «~root/.rhosts». It is a favourite target of attackers, who try to write «+ +» into it.
  - Local connections leave no trace. At best, tcpd reports where the connection comes from, but not the identity assumed on the local host.
    Remedy: use the logdaemon software package, URL «ftp://ftp.porcupine.org/pub/security/logdaemon-5.8.tar.gz»

6.26 R-service rshd, R-command rsh

(remote shell daemon, remote shell)

In general, the path of the R-service is «/usr/sbin/in.rshd» or «/usr/sbin/rshd».

Syntax of the R-command: rsh -l username hostname command

  % rsh -l besancon server.example.com date
  Sun Oct 12 15:20:28 MET DST 2003

184 6.26 R-service rshd, R-command rsh

If no command is given, an interactive shell is started.

  % rsh server.example.com
  Sun Microsystems Inc. SunOS 5.10 Generic January 2005
  You have new mail.
  server%

Whenever possible, prefer an SSH connection (here with «ssh»).

Access control

Access control works as on page 260 (section on RLOGIND), except on one point:
  - either the RSH connection is explicitly authorized by «/etc/hosts.equiv» or «$HOME/.rhosts», and it proceeds without asking for a password
  - or the connection is not explicitly authorized, in which case it is refused and no password is requested.

  % rsh -l besancon server.example.com date
  permission denied

185 6.27 R-service rcpd, R-command rcp

(remote copy daemon, remote copy)

In general, the path of the R-service is «/usr/sbin/in.rcpd» or «/usr/sbin/rcpd». Note that on many systems «rcp» has no dedicated daemon and runs over the RSHD service.

Several syntaxes of the R-command:
  - remote to local: rcp [-r] user@machine:filename path
  - local to remote: rcp [-r] path user@machine:filename

  % rcp user@machine:ananas.txt .
  % ls -l ananas.txt
  -rw-r--r-- 1 besancon ars 15 Oct 12 15:18 ananas.txt
  % rcp ananas.txt user@machine:/tmp/cerise.txt
  % rsh -l besancon server.example.com ls -l /tmp/cerise.txt
  -rw-r--r-- 1 besancon ars 15 Oct 12 15:22 /tmp/cerise.txt

Whenever possible, prefer an SSH connection (here with «scp»).

186 6.27 R-service rcpd, R-command rcp

Access control

Access control works as on page 260 (section on RLOGIND), except on one point:
  - either the RCP connection is explicitly authorized by «/etc/hosts.equiv» or «$HOME/.rhosts», and it proceeds without asking for a password
  - or the connection is not explicitly authorized, in which case it is refused and no password is requested.

  % rcp ananas.txt user@machine:/tmp/banane.txt
  permission denied

187 Chapter 7 File transfer protocols FTP, TFTP

Why this chapter

7.1 FTP protocol, ftp, ftpd

(File Transfer Protocol)

An FTP connection requires the FTP service to be enabled in INETD or XINETD, or to run as a standalone daemon.

The protocol is complicated at the network level: «/etc/services» shows 2 ports assigned to FTP:

  ftp-data  20/tcp
  ftp       21/tcp

There are in fact two FTP modes:
  - active FTP (default mode on Windows)
  - passive FTP (default mode on Linux)

188 7.1 FTP protocol, ftp, ftpd

Active FTP

[Figure: active FTP between FTP CLIENT and FTP SERVER. FTP control connection: client high port 1 to server port 21. FTP data connection: server port 20 to client high port 2.]

The FTP protocol is «complex» because it is a two-way dialogue:
  - An FTP client connects to port 21 («ftp» in «/etc/services») of the FTP server; this port 21 is used to send commands to the FTP server.
  - If the commands require data to be received from the server («dir», «get» for example) or sent to it («put» for example), the client sends the server a «PORT» command naming a port, to which the server opens a connection from its port 20 («ftp-data» in «/etc/services»).
  - The FTP-DATA connection is closed as soon as all the data has been transferred.

  client                    server
  connection      ->        port 21
  PORT            ->        port 21
  DIR             ->        port 21
  data            <-        port 20

189 7.1 FTP protocol, ftp, ftpd

Example:

  % ftp -v -d localhost
  Connected to localhost.
  220 cerise FTP server (SunOS 5.8) ready.
  Name (localhost:besancon): besancon
  ---> USER besancon
  331 Password required for besancon.
  Password:
  ---> PASS XXXXXXXXX
  230 User besancon logged in.
  ftp> lcd /tmp
  Local directory now /tmp
  ftp> cd /etc
  ---> CWD /etc
  250 CWD command successful.
  ftp> get motd
  ---> PORT 127,0,0,1,129,
  PORT command successful.
  ---> RETR motd
  150 ASCII data connection for motd ( ,33085) (54 bytes).
  226 ASCII Transfer complete.
  local: motd remote: motd
  55 bytes received in seconds (62.53 Kbytes/s)
  ftp> quit
  ---> QUIT
  221 Goodbye.

Passive FTP

[Figure: passive FTP between FTP CLIENT and FTP SERVER. FTP control connection: client high port 1 to server port 21. FTP data connection: client high port 2 to server high port 3.]
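The argument of the «PORT» command seen in the session above is six decimal bytes: four for the IP address and two for the TCP port (port = p1 * 256 + p2). The Python sketch below decodes and builds such an argument. The function names are hypothetical, and the last byte `61` is an assumption chosen so that the result equals the port 33085 printed in the «150» reply; the transcript itself does not show that byte.

```python
def parse_port_argument(argument):
    """Decode 'h1,h2,h3,h4,p1,p2' into (ip_address, tcp_port)."""
    parts = [int(x) for x in argument.split(",")]
    if len(parts) != 6 or not all(0 <= p <= 255 for p in parts):
        raise ValueError("expected six comma-separated bytes")
    ip = ".".join(str(p) for p in parts[:4])
    port = parts[4] * 256 + parts[5]      # high byte first
    return ip, port

def build_port_argument(ip, port):
    """Encode an IPv4 address and a port as a PORT command argument."""
    return ",".join(ip.split(".") + [str(port // 256), str(port % 256)])

print(parse_port_argument("127,0,0,1,129,61"))
print(build_port_argument("127.0.0.1", 33085))
```

In active mode the server then connects from its port 20 to the decoded address and port, which is why client-side firewalls tend to get in the way.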

190 7.1 FTP protocol, ftp, ftpd

Access control

For access control, basic implementations provide:
  - the file «/etc/ftpusers», which contains the names of the users who are not allowed to use ftp. «root» must be denied, as usual.
  - the file «/etc/shells», which contains the shells of the users who are allowed to use ftp.

Bottom line, to forbid a user from using FTP:
  - list the person's login in «/etc/ftpusers»
  - make sure the person's shell is not in «/etc/shells»

To set up anonymous FTP, refer to the two FAQs:
  «ftp://ftp.lip6.fr/pub/doc/faqs/ftp-list/faq.gz»
  «ftp://ftp.lip6.fr/pub/doc/faqs/computer-security/anonymous-ftp-faq.gz»

FTP server implementations

Besides the versions shipped by vendors, several freely available daemons perform better:
  « »
  « »
  « »

191 7.2 TFTP protocol, tftp, tftpd

TFTP: Trivial File Transfer Protocol

A tftp connection requires the TFTP service to be enabled in INETD or XINETD, or to run as a standalone daemon.

TFTP is roughly FTP without the ability to list remote directories and without any password needed to fetch or upload files!

In practice, whoever uses TFTP knows what to fetch and does not need to list the directory. For example, fetching a configuration file for:
  - an X terminal
  - an HP network printer

Files are fetched from the /tftpboot subtree.

The danger: a misconfigured tftpd daemon lets anyone fetch any file outside «/tftpboot», the file «/etc/passwd» for example. So check the startup options. Traditionally, use the «-s» option in «/etc/inetd.conf»:

  tftp dgram udp wait root /usr/etc/in.tftpd in.tftpd -s /tftpboot
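The confinement expected from a well-configured tftpd can be illustrated with a short Python sketch that maps a requested name under a root directory and rejects anything that would escape it. This only models the idea: a real daemon typically enforces it in another way (for instance with chroot), the function name `resolve_request` is hypothetical, and symbolic links are deliberately ignored.

```python
import posixpath

def resolve_request(root, requested):
    """Map a requested name under root; return None if it would escape."""
    # Absolute names are reinterpreted relative to root, as a chroot would do.
    candidate = posixpath.normpath(posixpath.join(root, requested.lstrip("/")))
    if candidate != root and not candidate.startswith(root + "/"):
        return None
    return candidate

print(resolve_request("/tftpboot", "cisco/router.cfg"))   # stays under /tftpboot
print(resolve_request("/tftpboot", "../etc/passwd"))      # rejected: None
```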

192 7.2 TFTP protocol, tftp, tftpd

Note: it is sometimes useful to create the following symbolic link:

  # cd /tftpboot
  # ln -s . tftpboot

because requests sometimes refer to names of the form «/tftpboot/fichier».

  % cd /tftpboot
  % ls -l
  drwxr-xr-x 2 root wheel 1536 Jan 4 15:02 cisco/
  drwxr-sr-x 7 root wheel 512 Nov hds/
  drwxr-sr-x 2 root wheel 512 Sep hp/
  drwxr-sr-x 4 root wheel 512 Dec ncd/
  drwxr-sr-x 2 root wheel 512 Mar plaintree/
  drwxr-xr-x 2 root wheel 512 Aug sun/
  lrwxrwxrwx 1 root wheel 1 May tftpboot -> .
  drwxr-sr-x 3 root wheel 512 Feb usr/

TFTP is used on CISCO equipment to download configurations or firmware. The same goes for other network equipment, for example the FOUNDRY brand:

  - loading the FOUNDRY configuration via TFTP from a UNIX server:

      copy tftp start router/4802.cfg
      reload

  - saving the FOUNDRY configuration to UNIX via TFTP:

      touch /tftpboot/router/downloads/x.y
      chown nobody:nobody /tftpboot/router/downloads/x.y
      copy run tftp router/downloads/x.y

193 Chapter 8 Remote Procedure Call (RPC)

Why this chapter

RPC technology is used by the NFS file sharing service. It is therefore a prerequisite for the next chapter.

8.1 Introduction

Remote Procedure Calls (RPC) emerged in a period marked by:
  - falling hardware costs
  - growing computing power
  - growing storage capacity

In short, the move to distributed systems (using the best-suited hardware for a given task, increasing system availability).

194 8.1 Introduction

Principle of a classic program:

  % prog.exe

  int coucou()
  {
      return(33);
  }

  int main(int argc, char *argv[])
  {
      int n;
      n = coucou();
      exit(0);
  }

Principle of a distributed program using a Remote Procedure Call (RPC): the same «main()» runs in the local «prog.exe» process, while «coucou()» runs in a remote process, and the call «n = coucou()» travels over the NETWORK.

195 8.2 External Data Representation (XDR) protocol

Reminder about processors. A 32-bit word made of the bytes 0x01 0x02 0x03 0x04 (most significant byte first) is stored at the increasing addresses 0xNNNN, 0xNNNN + 1, 0xNNNN + 2, 0xNNNN + 3 as:
  - big-endian processors (Motorola):  0x01 0x02 0x03 0x04
  - little-endian processors (Intel):  0x04 0x03 0x02 0x01

So there are two incompatible encoding conventions! One encoding has to be chosen for RPC exchanges! Big-endian was chosen (Intel processors were uncommon at the time).

See the C functions «ntohs()» (network to host short), «ntohl()» (network to host long), «htons()» (host to network short), «htonl()» (host to network long).

XDR provides a more general encoding for any data (structures, arrays, etc.). RPC makes heavy use of XDR.
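The byte-order issue is easy to observe from Python, whose standard `struct` and `socket` modules expose the same conversions as the C functions named above. The word `0x01020304` is an assumed example value matching the bytes discussed in this section.

```python
import socket
import struct

VALUE = 0x01020304  # assumed example word

big_endian = struct.pack(">I", VALUE)      # most significant byte first
little_endian = struct.pack("<I", VALUE)   # least significant byte first
network_order = struct.pack("!I", VALUE)   # "!" means network order (big-endian)

# htonl() returns an integer whose native in-memory layout is network order,
# whatever the byte order of the machine running this code.
via_htonl = struct.pack("=I", socket.htonl(VALUE))

print(big_endian.hex(), little_endian.hex(), network_order.hex(), via_htonl.hex())
```

XDR applies the same big-endian convention to its 32-bit integers, and adds rules for strings, arrays and structures.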

196 8.3 RPC client / server model

Remote Procedure Calls (RPC) use a client / server model:
  - The client is the process that calls a remote procedure.
  - The server is the process that carries out the remote procedure.
  - Communication goes over TCP.

[Figure: the local process (CLIENT, RPC API, client STUB, RPC runtime) talks to the remote process (PROCEDURE, server STUB, RPC runtime).]

8.4 Locating RPC procedures: portmapper, rpcbind

Calling a remote procedure requires locating it (TCP port number) on the remote host.

The «portmapper» process (other possible name: «rpcbind») acts as an RPC name server. It listens on TCP port 111.

Principle:
  - a remote procedure registers with the portmapper
  - the caller asks the portmapper where the procedure is
  - the caller then calls the procedure

The «portmapper» process (other possible name: «rpcbind») must be started before any process that registers RPC procedures (see the startup scripts).

197 8.4 Locating RPC procedures: portmapper, rpcbind

[Figure: 1. a client asks the portmapper process «Where is ananas() v2?» and the portmapper answers with the port of ananas() v2, taken from its table: ananas(), v2 = port 3248; banane(), v2 = port 2147; banane(), v3 = port 2148; etc. Process 927 provides procedure ananas() version 2 on port 3248; another process provides procedure banane() versions 2 and 3 on ports 2147 and 2148.]

8.5 Listing RPC procedures: rpcinfo

(RPC information)

Syntax: «rpcinfo [-p] [-s] hostname»

For example:

  % rpcinfo -s
  program version(s) netid(s) service owner
  ,3,4 udp,tcp,ticlts,ticotsord,ticots rpcbind super
  ,3,2,1 tcp,udp nlockmgr
  ticots,ticotsord,ticlts,tcp,udp status super
  ,3,2 udp,ticlts rstatd super
  ,2 udp,ticlts,tcp,ticotsord,ticots rusersd super
  udp,ticlts rquotad super
  ,2,1 ticots,ticotsord,tcp,ticlts,udp mountd super
  ,3,2 tcp,udp nfs
  ,2 tcp,udp nfs_acl 1
  ...
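The register-then-lookup principle of the portmapper can be modelled with a few lines of Python. This is a toy in-memory table, not the portmap protocol: a real portmapper keys its entries by program number, version and transport protocol, whereas this sketch uses the procedure names and ports of the figure. The class and method names are hypothetical.

```python
class PortMapper:
    """Toy registry mapping (program, version) to a port number."""

    def __init__(self):
        self._table = {}

    def register(self, program, version, port):
        """Called by a server process when it starts."""
        self._table[(program, version)] = port

    def lookup(self, program, version):
        """Called by a client; returns None when nothing is registered."""
        return self._table.get((program, version))

mapper = PortMapper()
mapper.register("ananas", 2, 3248)
mapper.register("banane", 2, 2147)
mapper.register("banane", 3, 2148)

print(mapper.lookup("ananas", 2))   # the client then calls the procedure on this port
```

The sketch also shows why the portmapper has to start first: a server that registers before the table exists has nowhere to announce its port.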

198 8.5 Listing RPC procedures: rpcinfo

  % rpcinfo -p
  program vers proto port service
  tcp 111 rpcbind
  tcp 111 rpcbind
  tcp 111 rpcbind
  udp 111 rpcbind
  udp 111 rpcbind
  udp 111 rpcbind
  udp 4045 nlockmgr
  udp 4045 nlockmgr
  udp 4045 nlockmgr
  udp 4045 nlockmgr
  tcp 4045 nlockmgr
  tcp 4045 nlockmgr
  tcp 4045 nlockmgr
  tcp 4045 nlockmgr
  ...

8.6 Access control

With LINUX came access control in PORTMAP, obtained by compiling it with the TCP WRAPPERS library.

Access control is therefore configured as on page 238, through «/etc/hosts.allow».

199 Chapter 9 NFS file sharing

Why this chapter

NFS file sharing remains a popular protocol. It uses RPC (broadly speaking, remote «read» and «write» procedures are executed on the server that stores the files).

9.1 Introduction

NFS = Network File System = access, transparent to the user, to files residing on remote hosts.

Currently, NFS version 2 is the most widespread (RFC 1094).

NFS version 3 exists and is available, but implementations differ in incompatible ways between vendors (RFC 1813).

NFS version 4 is under study. Cf « » or « » (RFC ). The version 4 protocol is very complex compared with versions 2 and 3.

200 9.2 NFS principles

[Figure: NFS layers on the client and on the server. On each side: system calls, VNODE / VFS, then NFS file system (client routines on the client, server routines on the server), then RPC / XDR, connected through the network.]

NFS has two sides: the NFS client and the NFS server.

[Figure: the NFS CLIENT («# mount», «% prog», rpc.statd, rpc.lockd, portmap (rpcbind)) and the NFS SERVER (portmap (rpcbind), rpc.mountd, nfsd, rpc.statd, rpc.lockd), connected through the NETWORK.]

201 9.2 NFS principles

The NFS client runs the daemons «biod» (or «nfsiod»), «rpc.lockd» and «rpc.statd».

[Figure: NFS CLIENT with «# mount», «% prog», rpc.statd, rpc.lockd and portmap (rpcbind), connected to the NETWORK.]

The NFS server runs the daemons «portmap» (or «rpcbind»), «mountd» (or «rpc.mountd»), «nfsd», «rpc.statd» and «rpc.lockd».

[Figure: NFS SERVER with portmap (rpcbind), rpc.mountd, nfsd, rpc.statd and rpc.lockd, connected to the NETWORK.]

202 9.2 NFS principles

NFS mount: filehandle

[Figure: on the NFS CLIENT, «# mount server:/home /mnt» contacts «rpc.mountd» on the NFS SERVER through the NETWORK; the server checks the export permissions in /etc/exports and computes the filehandle of /home; the client kernel then associates /mnt with that filehandle (= fh0).]

Filehandle = a channel (in the sense of stdin, stdout, stderr), but over the network.

Opening an NFS file: use of filehandles

[Figure: on the NFS CLIENT, «% prog.exe» calls open(/mnt/jardin/cerise.txt); the client kernel sends lookup(jardin, fh0) to the NFS server (port 2049) and receives fh1, then sends lookup(cerise.txt, fh1) and receives fh2.]

203 9.2 NFS principles

Reading/writing an NFS file: use of filehandles

[Figure: 1. on the NFS CLIENT, «% prog.exe» calls read() / write(); 2. the client kernel sends read(fh2, 0, 1024), read(fh2, ..., 1024) to the NFS server (port 2049).]

Deleting an open NFS file

[Figure: after «rm» of exemple.txt while it is still open, the file survives as «.nfsXXXXXXXXXX» so that read and write keep working.]

204 9.3 Starting NFS on LINUX

Prerequisite: enable networking in the file «/etc/sysconfig/network»:

  ...
  NETWORKING=yes
  ...

Start the NFS services (reminder: startup scripts in «/etc/rc.d/init.d», the «chkconfig» command, etc.):

  NFS server:
    - service «portmap» (start it first, because the others register with it)
    - service «nfslock»
    - service «nfs»

  NFS client:
    - service «portmap» (start it first, because the others register with it)
    - service «nfslock»

9.4 NFS exports on LINUX: /etc/exports

Trees can be exported read-only or read-write.

The trees exported by the NFS server are listed in the file «/etc/exports». Different options can be given for each export.

For example, exporting 2 partitions, the first one read-only («ro» = read only), the second one read-write («rw» = read write):

  /chemin/data1 client-nfs.example.com(ro)
  /chemin/data2 client-nfs.example.com(rw)
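The structure of an «/etc/exports» line (a path followed by one or more «client(options)» items) can be made explicit with a small Python sketch. It only covers the simple form used in the example above, not the full exports syntax, and the function name `parse_exports_line` is hypothetical.

```python
import re

def parse_exports_line(line):
    """Split '/path client(opt1,opt2) ...' into (path, {client: [options]})."""
    path, *clients = line.split()
    table = {}
    for item in clients:
        match = re.fullmatch(r"([^()]+)\(([^()]*)\)", item)
        if match:
            options = match.group(2)
            table[match.group(1)] = options.split(",") if options else []
        else:
            table[item] = []   # client listed without options
    return path, table

print(parse_exports_line("/chemin/data1 client-nfs.example.com(ro)"))
print(parse_exports_line("/chemin/data2 client-nfs.example.com(rw)"))
```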

205 9 NFS File Sharing 9.4 NFS exports on LINUX: /etc/exports

The NFS export table can be inspected on LINUX:

    # ls -l /proc/fs/nfs/exports
    -r--r--r-- 1 root root 0 Feb 19 19:05 /proc/fs/nfs/exports
    # cat /proc/fs/nfs/exports
    # Version 1.1
    # Path Client(Flags) # IPs
    /local /24(rw,no_root_squash,sync,wdelay,no_subtree_check)

After a change to the file "/etc/exports", the "mountd" and "nfsd" daemons on the server must be told to reconfigure themselves.

Command: exportfs
    to export all trees: "exportfs -a"
    to stop exporting all trees: "exportfs -ua"
    to switch to verbose mode: option "-v"

206 9 NFS File Sharing 9.5 NFS root export

The main problem in NFS is the one known as NFS root equivalence. In the NFS protocol this is referred to as UID mapping.

What rights does the root account of an NFS client machine have over the files exported by an NFS server?

In particular, can a file with mode 600 = "rw-------" owned by root on the NFS server be read by a root account on an NFS client? Is it the same user?

The answer depends on what you want to do (a "maybe yes, maybe no" answer...). It is configured in "/etc/exports". Two possibilities:

    The root UID on the NFS client machine is kept as the root UID on the NFS server in the NFS request.

    The root UID on the NFS client machine is converted in the NFS request into a UID with no special rights on the NFS server. Traditionally the UID of the user "nobody" is used:

        % grep nobody /etc/passwd
        nobody:*:65534:65534:unprivileged user:/nonexistent:/sbin/nolo

    (note: sometimes the user is "nfsnobody", but the principle is the same)
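The two possibilities above can be illustrated with a small model. It is deliberately simplified: only the owner and "other" read bits are checked, groups are ignored, and the names `effective_uid` and `can_read` are hypothetical. The value 65534 is the UID of "nobody" in the passwd line quoted above.

```python
NOBODY_UID = 65534  # UID of "nobody" in the passwd line quoted on the slide


def effective_uid(client_uid, root_squash=True, anon_uid=NOBODY_UID):
    """UID the server acts on for a request issued by `client_uid`."""
    if client_uid == 0 and root_squash:
        return anon_uid      # root on the client becomes an unprivileged user
    return client_uid        # every other UID is passed through unchanged


def can_read(owner_uid, mode, request_uid):
    """Simplified read check: owner and 'other' bits only, groups ignored."""
    if request_uid == 0:
        return True                      # root bypasses permission bits
    if request_uid == owner_uid:
        return bool(mode & 0o400)
    return bool(mode & 0o004)


# A file with mode 600 owned by root on the server, read by root on a client:
print(can_read(0, 0o600, effective_uid(0, root_squash=True)))    # False
print(can_read(0, 0o600, effective_uid(0, root_squash=False)))   # True
```

With the conversion enabled, the client's root is treated like any other unprivileged account, so the mode 600 file stays unreadable.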

207 9 NFS File Sharing 9.5 NFS root export

Example of the UID conversion in the request (the mounted partition is exported without NFS root rights):

    # hostname
    client-nfs.example.com
    # mount -t nfs serveur-nfs.example.com:/adm/backup/arch /mnt
    # id
    uid=0(root) gid=0(wheel) groups=0(wheel),1(daemon),2(kmem)
    # cd /mnt
    # df .
    Filesystem 1024-blocks Used Available Capacity Mounted on
    serveur-nfs.example.com:/adm/backup/arch % /mnt
    # ls -ld /mnt
    drwxrwxrwx 14 root root 512 Oct /mnt    <-- anyone can write here
    # touch test.txt
    # ls -l
    total 0
    -rw------- 1 root daemon 75 Feb 3 15:01 motd
    -rw-r--r-- 1 nobody nogroup 0 Feb 3 14:59 test.txt
    # cat motd
    cat: motd: Permission denied

Export that keeps the root UID in NFS requests

LINUX option "no_root_squash" to put in "/etc/exports":

    /chemin/data client.example.com(rw,no_root_squash)

Always state explicitly what you want ("no_root_squash" or "root_squash").

208 9 NFS File Sharing 9.5 NFS root export

Export that converts the root UID in NFS requests

LINUX option "root_squash" to put in "/etc/exports":

    /chemin/data client.example.com(rw,root_squash)

Always state explicitly what you want ("no_root_squash" or "root_squash").

9 NFS File Sharing 9.6 NFS non-transitivity rule

There is no NFS transitivity.

If server A exports "/partition" to machine B,
and machine B mounts "/partition" as "/partition2" and exports "/partition2" to machine C,
then machine C does not get access to the contents of the original "/partition"!

Otherwise there would be no security and no possible control over exports.

But I believe this can work on LINUX (grumble...). To be completed...

209 9 NFS File Sharing 9.7 Manual NFS mount

Usual syntax: mount -t nfs server:/tree /mount/point

    # mount -t nfs serveur-nfs.example.com:/export/home /mnt
    # df /mnt
    Filesystem 1k-blocks Used Available Use% Mounted on
    serveur-nfs.example.com:/export/home % /mnt
    # umount /mnt

9 NFS File Sharing 9.8 Automatic NFS mount

Automatic mounts are configured in /etc/fstab (or equivalent):

    ...
    serveur-nfs.example.com:/export/home /mnt nfs hard,intr

The syntax is the one shown above on all UNIX systems that use an "/etc/fstab" file.

Once the file "/etc/fstab" is configured, you can do the following:

1 Mount one specific remote partition:
    # mount /users
2 Mount all remote partitions:
    # mount -t nfs -v -a
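As an illustration of what "mount -t nfs -a" has to select, the sketch below extracts the NFS entries from fstab-style text. The function name `nfs_mounts` is hypothetical, and the "/dev/sda1" line in the sample is invented for contrast; only the first line of the sample comes from the slide.

```python
FSTAB_SAMPLE = """\
# device                              mount point  type  options
serveur-nfs.example.com:/export/home  /mnt         nfs   hard,intr
/dev/sda1                             /            ext3  defaults
"""


def nfs_mounts(fstab_text):
    """Return one dict per fstab entry whose filesystem type is 'nfs'."""
    mounts = []
    for raw in fstab_text.splitlines():
        fields = raw.split("#", 1)[0].split()   # ignore comments
        if len(fields) >= 4 and fields[2] == "nfs":
            device, mount_point, _fstype, options = fields[:4]
            server, _, remote_path = device.partition(":")
            mounts.append({
                "server": server,
                "remote_path": remote_path,
                "mount_point": mount_point,
                "options": options.split(","),
            })
    return mounts


for entry in nfs_mounts(FSTAB_SAMPLE):
    print(entry)
```

Each entry carries the same four pieces of information as the manual command: server, exported tree, mount point and options.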

210 9 NFS File Sharing 9.9 NFS mount option soft

Mount option "soft": if, for whatever reason, the RPC operations that implement the NFS request fail, the NFS request fails as well.

This situation is comparable to a local disk breaking down.

One symptom is that blocks filled with NULL characters can appear in files newly written through NFS on a partition that has shown problems.

9 NFS File Sharing 9.10 NFS mount option hard

Mount option "hard": if, for whatever reason, the RPC operations that implement the NFS request fail, the NFS request is submitted again, until it succeeds.

This situation is comparable to a very slow local disk.

To keep a hard-mounted request from being retransmitted indefinitely, the mount can be done in hard,intr mode, which allows the request to be interrupted from the keyboard or by sending signals.

In practice, always use hard,intr mounts for partitions that are accessed read-write.
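The difference between the two options can be sketched as a retry loop. This is a simplified model, not the kernel's logic: `RpcTimeout`, `nfs_request` and `flaky_server` are hypothetical names, and the `retrans` parameter here simply caps the number of attempts of a soft mount. EIO is the error that system calls usually return when a soft mount gives up.

```python
import errno


class RpcTimeout(Exception):
    """Raised by the (simulated) transport when an RPC gets no reply."""


def nfs_request(send_rpc, mode="hard", retrans=3):
    """Issue one NFS request with 'soft' or 'hard' semantics."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return send_rpc()
        except RpcTimeout:
            if mode == "soft" and attempts >= retrans:
                # soft: give up and report the failure to the application
                raise OSError(errno.EIO, "NFS request failed (soft mount)")
            # hard: keep retrying until the server answers


def flaky_server(failures):
    """Build a fake RPC call that times out `failures` times, then succeeds."""
    state = {"left": failures}

    def send_rpc():
        if state["left"] > 0:
            state["left"] -= 1
            raise RpcTimeout()
        return "data"

    return send_rpc


print(nfs_request(flaky_server(5), mode="hard"))   # prints data
try:
    nfs_request(flaky_server(5), mode="soft", retrans=3)
except OSError as exc:
    print("soft mount gave up:", exc)
```

With a hard mount the application only sees a slow call; with a soft mount it must be prepared to handle the error, which many programs are not.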

211 9 NFS File Sharing 9.11 Checking exports: showmount

When the NFS mount of a partition fails, first check whether the required export is actually in place.

The command to use is "showmount -e":

    % showmount -e serveur-nfs.example.com
    Export list for serveur-nfs.example.com:
    /export/home .example.com
    /opt client-nfs.example.com
    /usr/local .example.com
    /var/mail .example.com

9 NFS File Sharing 9.12 Checking exports: rpcinfo

You can also check remotely whether the NFS server is running the mountd daemon, with the "rpcinfo" command.

    % rpcinfo -p serveur-nfs.example.com | grep mount
    udp mountd
    udp mountd
    udp mountd
    tcp mountd
    tcp mountd
    tcp mountd

212 9 NFS File Sharing 9.13 NFS error messages

A typical error message looks like this:

    NFS write error: on host serveur-nfs.example.com remote file system full

Sometimes the message is cryptic:

    NFS write error 60 on host nfs-client.example.com fh a0000 cdbe 66b10eac a0000 1d00 5fdbece5

To decode it, refer to "<sys/errno.h>". Here we deduce:

    #define ETIMEDOUT 60 /* Connection timed out */

9 NFS File Sharing 9.14 Appendix 1

Attached to the printed version of this course is a document on the NFS protocol: NFS_Protocol_Sequence_Diagram.pdf
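The lookup in "<sys/errno.h>" described in section 9.13 can also be done from a script. The sketch below uses Python's standard errno and os modules; `describe_errno` is a hypothetical helper name. Note that error numbers are platform-specific: 60 is ETIMEDOUT on BSD-derived systems, while Linux uses 110, so a code must be decoded on the same kind of system that printed the message.

```python
import errno
import os


def describe_errno(code):
    """Return 'number = NAME: message' for an error number on this system."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} = {name}: {os.strerror(code)}"


# Decode the local value of ETIMEDOUT (60 on BSD-derived systems, 110 on Linux).
print(describe_errno(errno.ETIMEDOUT))
```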

213 Network File System Protocol (NFS Protocol Sequence Diagram)

EventStudio System Designer Aug-07 22:47 (Page 1)
This diagram was generated with EventStudio System Designer 4.0. Copyright EventHelix.com Inc. All Rights Reserved.

Participants: Client (NFS Client: Application, Client Shell) and Server (NFS Server: Port Mapper, Mountd Daemon, NFSD Daemon).

This sequence diagram describes mounting, opening and reading of a file via the NFS (Network File System).

Server Startup

    Open Port 111 for UDP and TCP
        The port mapper starts listening on UDP and TCP port 111 for program number to port mapping requests from NFS clients.
    1: Register Program Number = , Mount Port Number
        Mountd, the daemon handling NFS mounts, registers its port number for receiving mount requests (Program number ).
    2: Register Program Number = , NFS Port Number
        The NFS server registers its port number for NFS (Program number ).

NFS Mount (mount server:/user/bill as /nfs/bill)

    1: mount() server:/user/bill, /nfs/bill
        Application initiates mounting of a file system. The mount API for the OS is invoked.
    2: RPC get_port request port_num = 111, Program Number =
        The mount application's port number on the server is determined from the port mapper. The request includes the program number for mount. Remote Procedure Call (RPC) is used as the underlying protocol for this access. All message interactions in NFS are performed as remote procedure calls.
    3: RPC get_port reply Mount Port Number
    4: RPC MOUNT request port_num = 111, Mount Port Number, server:/user/bill, /nfs/bill
        Request the NFS server to mount the file system.
    Authenticate Client
        Mountd uses the client IP address to authenticate the client.
    Perform a local mount for the requested file system
        The server performs a local mount. The results of this mount command will be passed back to the NFS client.
    5: RPC MOUNT reply File Handle for the File System
        Communicate the mount results to the NFS client.
    Invoke the mount system call to associate the NFS handle with a local mount point
        The remote mount information is saved along with the local mount point.
    6: returns SUCCESS
        Report a successful mount to the application.

Opening a File (read file /nfs/bill/public/nfs_tutorial.txt)

    1: open() file = /nfs/bill/public/nfs_tutorial.txt, mode = read only

214 Network File System Protocol (NFS Protocol Sequence Diagram)

Opening a File (continued)

    Application uses the OS API to open a file. The application is not aware that the requested file is remotely located and needs to be accessed via NFS.
    Process file name to find that /nfs/bill is NFS mounted
        The OS parses the file name to find that /nfs/bill is an NFS mounted file system. This initiates the NFS file open processing.
    2: RPC get_port request port_num = 111, Program Number =
        Since the file is located on an NFS server, the client initiates access by requesting the port number for the NFS service.
    3: RPC get_port reply NFS Port Number

    The client hierarchically parses the path and obtains a file handle. NFS commands GETATTR and LOOKUP are used to perform these operations. GETATTR returns the attributes of a file, for example permissions, file size, file owner. LOOKUP looks up the file and returns a file handle to the file.

    4: RPC GETATTR Request /user/bill
        Get the file attributes for the /user/bill directory.
    5: RPC GETATTR Reply status = OK
    6: RPC LOOKUP Request /user/bill/public
        Check if the public directory is available.
    7: RPC LOOKUP Reply status = OK
    8: RPC LOOKUP Request /user/bill/public/nfs_tutorial.txt
        Check if the requested file exists.
    9: RPC LOOKUP Reply status = OK
    10: RPC GETATTR Request /user/bill/public/nfs_tutorial.txt
        Obtain the attributes of the requested file.
    11: RPC GETATTR Reply status = OK
    12: returns File Pointer

Reading File Contents

    1: read()
        The application initiates a read for the file. This file is on an NFS mounted file system, so the read command is performed via NFS READ RPC calls. The READ calls specify the file name and the starting offset in each request. The NFS server reads the standard block size and returns it to the client. Note that the NFS server itself is stateless and handles each request independently. The client maintains the application and performs multiple reads to complete the file read. The following interaction shows how the client performs multiple reads to complete the file reading operation.
    2: RPC READ Request /user/bill/public/nfs_tutorial.txt, start_offset = 0
        Read from the file start (offset 0).
    3: RPC READ Reply status = OK, read_bytes = 1024

215 Network File System Protocol (NFS Protocol Sequence Diagram)

Reading File Contents (continued)

    4: RPC READ Request /user/bill/public/nfs_tutorial.txt, start_offset = 1024
        Initiate the read from offset 1024 (byte offsets 0 to 1023 have already been read).
    5: RPC READ Reply status = OK, read_bytes =
    6: RPC READ Request /user/bill/public/nfs_tutorial.txt, start_offset = 2048
        Read the remaining part of the file.
    7: RPC READ Reply status = OK, read_bytes = 20
    8: returns
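The sequence of READ calls in the diagram (offsets 0, 1024 and 2048, with a final short read of 20 bytes) can be reproduced with a small simulation. It is only a sketch: the server side is a bytes object, `read_rpc` and `read_whole_file` are hypothetical names, and the 2068-byte length is derived from the offsets shown in the diagram (2048 + 20).

```python
def read_rpc(data, start_offset, count=1024):
    """Stateless 'server': every request carries its own offset and count."""
    return data[start_offset:start_offset + count]


def read_whole_file(data, block_size=1024):
    """Client loop: the client, not the server, keeps track of the offset."""
    offset, chunks, trace = 0, [], []
    while True:
        chunk = read_rpc(data, offset, block_size)
        if not chunk:
            break                            # nothing left to read
        trace.append((offset, len(chunk)))   # (start_offset, read_bytes)
        chunks.append(chunk)
        offset += len(chunk)
        if len(chunk) < block_size:
            break                            # short read: end of file reached
    return b"".join(chunks), trace


content, trace = read_whole_file(bytes(2068))
print(trace)   # [(0, 1024), (1024, 1024), (2048, 20)]
```

Because each request names its own offset, the server needs no memory of earlier requests, which is the statelessness noted in the diagram.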

216 9 NFS File Sharing 9.15 Appendix 2

Attached to the printed version of this course is a document on NFS written by NETWORK-APPLIANCE (technical report TR 3183). (copy on site

217 TECHNICAL REPORT Using the Linux NFS Client with Network Appliance Filers Getting the Best from Linux and Network Appliance Technologies Chuck Lever, Network Appliance August 2004 TR 3183 TECHNICAL REPORT Network Appliance, a pioneer and industry leader in data storage technology, helps organizations understand and meet complex technical challenges with advanced storage solutions and global data management Abstract This report helps you get the best from your Linux NFS clients when used in an environment that includes Network Appliance filers. You will learn what level of performance to expect from your Linux systems. You will learn how to tune your Linux clients and diagnose performance and reliability problems. Finally, you will learn where to look for more information when faced with problems that are difficult to diagnose. Filer tuning information specific to your application may be available in other Network Appliance technical reports. This document is appropriate for customers, systems engineers, technical marketing engineers, and customer support personnel who install and configure Linux systems that use NFS to access Network Appliance filers and network caches. Network Appliance Inc. 1

218 Table of Contents

Abstract
1) Typographic Conventions
2) Introduction
3) Which Linux NFS Client Is Right for Me?
    3.1) Identifying Kernel Releases
    3.2) Today's Linux distributions
    3.3) The NFS Client in the 2.4 Kernel
    3.4) The NFS Client in the 2.6 Kernel
4) Foolproof Mount Options for Linux NFS Clients
    4.1) Choosing a Network Transport Protocol
    4.2) Capping the Size of Read and Write Operations
    4.3) Special Mount Options
    4.4) Tuning NFS client cache behavior
    4.5) Unmounting NFS File Systems
    4.6) Mount Option Examples
5) Performance
    5.1) Linux NFS client performance
    5.2) Linux NFS Client Architecture
    5.3) Diagnosing Performance Problems with the Linux NFS Client
    5.4) Error Messages in the Kernel Log
    5.5) Oops
    5.6) Getting Help
6) Other Sundries
    6.1) Telling Time
    6.2) Security
    6.3) Network Lock Manager
    6.4) Using the Linux Automounter

219 Table of Contents (continued)

    6.5) Net booting your Linux NFS clients
7) Executive Summary
8) Appendix
    8.1) Related Material
    8.2) Special network settings
    8.3) Controlling File Read-Ahead in Linux
    8.4) How to Enable Trace Messages
    8.5) How to Enable Uncached I/O on RHEL AS

1) Typographic Conventions

Linux and filer commands and file names appear in Courier New. Summary information appears in red italicized type at the end of each section, and an executive summary appears at the end of the document.

220 2) Introduction

More and more Network Appliance customers recognize the value of Linux in their enterprises. Historically, the Linux NFS client has trailed the rest of Linux in providing the level of stability, performance, and scalability that is appropriate for enterprise workloads. In recent times, however, the NFS client has improved considerably and continues to improve in performance and ability to work under degraded network conditions.

This document addresses several areas that concern those who are planning a new Linux deployment or are administering an existing environment that contains Linux NFS clients accessing Network Appliance filers. These areas include:

    What level of performance and stability to expect from Linux NFS clients
    How to tune Linux NFS clients to perform well with filers
    How to diagnose client and network problems that involve Linux NFS clients
    Which network interfaces and drivers work best
    How to configure other services required to provide advanced NFS features
    Where to find more tuning and diagnostic information on NOW (NetApp on the Web) and the Internet

Except for clients running Oracle databases, Network Appliance does not recommend specific Linux kernel releases or distributions.

First, there are too many distributions and releases to qualify them all. There are more than a half-dozen distributions and thousands of different kernel releases. Add to that the many different versions of user-level helper applications (such as the mount command). Because all of these are open source, you can modify or replace any part of your client.

Second, many hardware and application vendors specify a small set of releases or a single release and distribution that are certified and supported. It would be confusing for us to recommend one particular kernel or distribution when your hardware vendor recommends another, and your application vendor specifies yet a third.
Finally, some applications are more sensitive to NFS client behavior than others. Recommending a particular Linux NFS client depends on the applications you want to run and what your performance and reliability requirements are. Therefore, instead of recommending one or two releases that work well, we provide some guidelines to help you decide among the many Linux distributions and releases, and we provide advice on how to make your Linux NFS clients work their best.

221 3) Which Linux NFS Client Is Right for Me?

Before we begin our focus on technical issues, we cover some basic technical support challenges specific to Linux. The Linux NFS client is part of the Linux kernel. Because Linux is open source, you might think that it is easy to provide Linux kernel patches to upgrade the NFS client. In fact, providing a patch that fixes a problem or provides a new feature can be complicated by several facts of life in the Linux and open source worlds.

There are many different parts to Linux, but the two we are concerned about are the distribution and the kernel. The distribution is the set of base operating system files that are included when customers install a Red Hat or SUSE Linux distribution on their hardware. This includes commands, applications, and configuration files. A kernel comes with a distribution, but customers can replace it, usually without affecting other files provided in the distribution.

3.1) Identifying Kernel Releases

The version number of a Linux distribution and the release number of a Linux kernel use different naming schemes. While planning a distribution, each distributor chooses a particular kernel release (for example, 2.6.5) and adds some modifications of its own before placing the kernel into a distribution. To reduce the amount of variation they encounter in their support contracts, distributors support only a small set of kernels, most of which are carefully designed for a specific distribution.

Because the NFS client is part of the kernel, updates to the NFS client require that you replace the kernel. Technically, it is easy to replace a kernel after a distribution is installed, but Linux customers risk losing distributor support for their installation if they install a kernel that was not built by the distributor. For this reason, Network Appliance does not recommend specific patches or kernel versions.
Often support contracts constrain customers so they cannot install a patch until their chosen distributor provides a supported kernel that includes the recommended patch.

The current kernel is released in two branches, known as the stable branch and the development branch. The stable branch is ostensibly the branch that is hardened, reliable, and has unchanging program interfaces, while the development branch can be (and often is) unstable and sometimes is even unbootable. Stable branches have even minor release numbers, such as 2.4, while development branches have odd minor release numbers, such as 2.5. The previous stable branch is 2.4, the latest stable branch is 2.6, and the latest development branch has not been opened yet.

Linux kernels are not published on a time-based schedule. Kernel revisions are released when the branch maintainer decides they are ready. New features and API changes are allowed in development kernels, but there is no schedule for when such a kernel will become a stable release. Development branches have historically taken two years to 30 months to become stable branches. It is for this reason that there is often significant pressure to add new features to stable releases instead of working them into development releases.

3.2) Today's Linux distributions

As mentioned above, distributions are numbered differently than kernels. Each distributor chooses its own numbering scheme. When describing your Linux environment to anyone, be sure to list both the distribution release number and the kernel release number. Distributors usually append another number on the end of their kernel versions to indicate which revision of that kernel is in use. For instance, Red Hat shipped kernel with its 7.3 distribution, but made several other errata kernels available over time to fix problems in its kernel: , , , and so on.
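The numbering conventions just described (even minor number for a stable branch, odd for a development branch, and a distributor revision appended after a hyphen) can be sketched as a small parser. The function name `parse_kernel_release` and the sample string "2.4.21-15.ELsmp" are hypothetical, chosen only to illustrate the shape of such a release string; the even/odd rule applies to the 2.x numbering scheme described in this report.

```python
import re


def parse_kernel_release(release):
    """Split a kernel release string into base version, suffix and branch."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    minor = int(match.group(2))
    base, _, suffix = release.partition("-")
    return {
        "base": base,                          # community kernel version
        "distributor_suffix": suffix or None,  # revision added by the distributor
        "branch": "stable" if minor % 2 == 0 else "development",
    }


print(parse_kernel_release("2.4.21-15.ELsmp"))   # hypothetical release string
print(parse_kernel_release("2.5.40"))
```

When reporting a problem, quoting the whole string matters: the suffix identifies the distributor's errata kernel, which the base version alone does not.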

222 An important trend in commercial Linux distributions is the existence of enterprise Linux distributions. Enterprise distributions are quality-assured releases that come with special support contracts. Not all customers need this level of support, however. Red Hat recently changed its product line, dropping the professional series of distributions in favor of an openly maintained distribution called Fedora. Fedora is intended for developers and customers who can tolerate some instability on their Linux systems. SUSE continues to sell an enterprise distribution as well as a less expensive desktop product line. Distributions such as Mandrake and Debian are ideal for customers looking for a low-cost general purpose Linux distribution.

Network Appliance recommends that its Linux customers always use the latest actively maintained distributions available. Customers running older unsupported distributions no longer get the benefits of security fixes and quick bug fixes on their Linux clients. Most Linux distributors will not address bugs in older distributions at all. Especially if your clients and filers are not protected by a firewall, it is important for you to stay current with the latest errata patches available for your distribution.

To find out which kernel your clients run, you can use this command:

    % uname -r
    ELsmp
    %

Kernels built from community source code usually have only three or four dot-separated numbers, like . Distributors generally add a hyphen and more version numbers (in this case, ), which indicate that additional patches over the community source base have been applied. The keyword on the end, such as hugemem or smp, shows the hardware capabilities this kernel was built for.

3.3) The NFS Client in the 2.4 Kernel

The NFS client in this kernel has many improvements over the older 2.2 client, most of which address performance and stability problems.
The NFS client in kernels later than has significant changes to help improve performance and stability.

Customers that use 2.4 kernels on hardware with more than 896MB should know that a special kernel compile option, known as CONFIG_HIGHMEM, is required for the system to access and use physical memory above 896MB. The Linux NFS client has a known problem in these configurations where an application or the whole client system can hang at random. This issue has been addressed in the kernel, but still haunts kernels contained in distributions from Red Hat and SUSE that are based on earlier kernels.

Early releases of Red Hat 7.3 contained a kernel that demonstrated very poor NFS client performance when mounting with UDP. Recent errata kernels that fix some of the performance problems are available from Red Hat. Network Appliance recently published an article describing this problem in detail. See document ntapcs6648 on NOW (the URL is in the appendix).

Earlier kernels in the 2.4 series may have some problems when using NFS over TCP. The latest distributions (SUSE SLES 8 SP3, Fedora Core 1, and RHEL 3.0) use kernels that have a robust NFS over TCP implementation. Kernels older than can suffer from problems with NFS over TCP that result from lossy networks and overloaded NFS servers. The problem, documented in BURT 96021, can cause mount points to become unusable until the client is rebooted.

No matter which 2.4 kernel your distribution uses, you should always start with NFS over TCP first, as TCP has a number of important benefits over UDP.

223 3.4) The NFS Client in the 2.6 Kernel

During 2004, we expect distributions based on the 2.6 kernel to appear and become stable enough to be employed in production environments. SUSE SLES 9 is the first enterprise Linux distribution to use 2.6, and Fedora Core 2 is also a 2.6-based distribution that is available now.

A new feature in the 2.6 kernel is support for the latest version of the NFS protocol, version 4. Developers are still in the process of retrofitting the Linux NFS client and server implementations for the new protocol version. Certain features are available today in 2.6 kernels, but others, such as read and write delegation and replication and migration support, are still under development and are not yet available. Support for NFS version 4 is available now in Fedora Core 2.

The 2.6 kernel also brings support for advanced authentication mechanisms such as Kerberos 5. Support for Kerberos will be available for NFS versions 2 and 3 as well as for NFS version 4. Kerberos authentication increases security by reducing the likelihood that user identities in NFS requests can be forged. It also provides optional facilities to ensure the integrity or privacy of communication between an NFS client and server.

The NFS client in the 2.6 kernel has demonstrated superior performance and stability over older Linux NFS clients, but as usual customers should be cautious about moving their production workloads onto this very new release of the Linux kernel.

In summary: You should use the latest distribution and kernel available from your distributor when installing a new deployment, and attempt to keep your existing Linux clients running the latest updates from your distributor. Always check with your hardware and application vendors to be certain they support the kernel you choose to run. Contact us if you have any special requirements.

224 4) Foolproof Mount Options for Linux NFS Clients

If you have never set up mount options on an NFS client before, review the nfs man page on Linux to see how these terms are defined. You can type man nfs at a shell prompt to display the page. In addition, O'Reilly's Managing NFS & NIS covers many of these topics (see the appendix for URL and ISBN information).

You can look in /etc/fstab on the client to see what options the client attempts to set when mounting a particular file system. Check your automounter configuration to see what defaults it uses when mounting. Running the mount command at a shell prompt tells you what options are actually in effect. Clients negotiate some options, for example, the rsize option, with servers. Look in the client's /proc/mounts file to determine exactly what mount options are in effect for an existing mount.

The default NFS protocol version (2 or 3) used when mounting an NFS server can change depending on what protocols the server exports, which version of the Linux kernel is running on the client, and what version of the mount utilities package is in use. Version 2 of NFS is the default. To ensure that your client uses the NFSv3 protocol, you should specify vers=3 when mounting a filer. Be sure that the NFSv3 protocol is enabled on your filer before trying to mount using vers=3 by using the options nfs command on your filer's console.

The hard mount option is the default on Linux and is mandatory if you want data integrity. Using the soft option reduces the likelihood of client instability during server and network outages, but it exposes your applications to silent data corruption, even if you mount file systems read-only. If a soft timeout interrupts a read operation, the client's cached copy of the file is probably corrupt.
To purge a corrupted file requires that some application locks and unlocks the file, that the whole file system is unmounted and remounted, or that another client modifies the file's size or mtime. If a soft timeout interrupts a write operation, there is no guarantee that the file on the server is correct, nor is there any guarantee that the client's cached version of the file matches what is on the server.

A client can indicate that a soft timeout has occurred in various ways. Usually system calls return EIO when such a timeout occurs. You may also see messages in the kernel log suggesting that the client had trouble maintaining contact with the server and has given up. If you see a message that says the client is still trying, then the hard mount option is in effect.

As an alternative to soft mounting, consider using the intr option, which allows users and applications to interrupt the NFS client when it gets stuck waiting for server or network recovery. On Linux, interrupting applications or mount commands does not always work, so sometimes rebooting your client is necessary to recover a mount point that has become stuck because the server is not available.

When running applications such as databases that depend on end-to-end data integrity, you should use hard,nointr. Oracle has verified that using intr instead of nointr can expose your database to the risk of corruption when a database instance is signaled (for example, during a shutdown abort sequence).

The soft option is useful only in a small number of cases. If you expect significant server or network instability, try using the soft option with TCP to help reduce the impact of temporary problems. When using the soft option, especially with UDP, set a long retransmission timeout and a relatively large number of retries. This reduces the likelihood that very brief outages or a few dropped packets will cause an application failure or data corruption.
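Since /proc/mounts shows the options actually in effect, a short script can flag mounts that use the soft option. The sketch below works on sample text rather than the live file; the two filer lines and the "/dev/sda1" line are invented for illustration, the function names are hypothetical, and the exact spelling of options in /proc/mounts varies between kernel versions.

```python
SAMPLE_PROC_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
filer.example.com:/vol/vol0 /mnt nfs rw,vers=3,hard,intr,proto=tcp,timeo=600,retrans=2 0 0
filer.example.com:/vol/scratch /scratch nfs rw,vers=3,soft,proto=udp,timeo=4,retrans=9 0 0
"""


def nfs_mount_options(proc_mounts_text):
    """Map each NFS mount point to a dict of its effective options."""
    result = {}
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            options = {}
            for item in fields[3].split(","):
                key, sep, value = item.partition("=")
                options[key] = value if sep else True   # flags map to True
            result[fields[1]] = options
    return result


def soft_mounts(proc_mounts_text):
    """Return the mount points that are mounted with the soft option."""
    mounts = nfs_mount_options(proc_mounts_text)
    return sorted(point for point, opts in mounts.items() if "soft" in opts)


print(soft_mounts(SAMPLE_PROC_MOUNTS))   # ['/scratch']
```

On a real client the same functions could be fed the contents of /proc/mounts, for example with `open("/proc/mounts").read()`.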

225 4.1) Choosing a Network Transport Protocol

Although UDP is a simple transport protocol that has less CPU and network overhead than TCP, NFS over UDP has deficiencies that are exposed on congested networks, such as routed multispeed networks, DSL links, and slow WANs, and should never be used in those environments. Unless you need the very last bit of performance from your network, TCP is a safe bet, especially on versions of Linux more recent than . Future versions of the NFS protocol may not support UDP at all, so it is worth planning a transition to TCP now if you still use primarily NFS over UDP.

NFS over TCP can handle multispeed networks (networks where the links connecting the server and the client use different speeds), higher levels of packet loss and congestion, fair bandwidth sharing, and widely varying network and server latency, but can cause long delays during server recovery. (Note that older releases of Data ONTAP do not enable NFS over TCP by default. From Data ONTAP release 6.2 onward, TCP connections are enabled automatically on new filer installations.) Although TCP has slightly greater network and CPU overhead on both the client and server, you will find that NFS performance on TCP remains stable across a variety of network conditions and workloads.

If you find UDP suits your needs better and you run kernels older than , be sure to enlarge your client's default socket buffer size by following the instructions listed in the appendix of this guide. Bugs in the IP fragmentation logic in these kernels can cause a client to flood the network with unusable packets, preventing other clients from accessing the filer. The Linux NFS client is especially sensitive to IP fragmentation problems that can result from congested networks or undersized switch port buffers.
If you think IP fragmentation is an issue for your clients using NFS over UDP, run the netstat -s command on the client and on the filer; it will show continuous increases in the number of IP fragmentation errors. Be certain to apply the socket buffer change to all Linux NFS clients on your network in order for the change to be completely effective. Note that RHEL 3.0's kernels, even though based on [...] pre1, already set transport socket buffer sizes correctly.

In Linux kernels older than [...], a remote TCP disconnect (for example during a cluster failover) occasionally may cause a deadlock on the client that makes a whole mount point unusable until the client is rebooted. There is no workaround other than upgrading to a version of the Linux kernel where this issue is addressed.

For databases, we recommend NFS over TCP. There are rare cases where NFS over UDP on noisy or very busy networks can result in silent data corruption. In addition, Oracle9i RAC is certified on NFS over TCP running on Red Hat Advanced Server 2.1.

You can control RPC retransmission timeouts with the timeo option. Retransmission is the mechanism by which clients ensure a server receives and processes an RPC request. If the client does not receive a reply for an RPC within a certain interval for any reason, it retransmits the request until it receives a reply from the server. After each retransmission, the client doubles the retransmit timeout up to 60 seconds to keep network load to a minimum.

By default, the client retransmits an unanswered UDP RPC request after 0.7 seconds. In general, it is not necessary to change the retransmission timeout for UDP, but in some cases, a shorter retransmission timeout for NFS over UDP may shorten latencies due to packet losses. As of kernel [...], an estimation algorithm that adjusts the timeout for optimal performance governs the UDP retransmission timeout for some types of RPC requests.
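The doubling behavior described above can be made concrete with a small calculation. The following Python sketch is an illustration only, not the kernel's implementation: the function name is hypothetical, it assumes timeo is expressed in tenths of a second (so timeo=7 corresponds to the 0.7-second default), and it assumes the timeout doubles after every retransmission and is capped at 60 seconds.

```python
def retransmit_schedule(timeo=7, retrans=5, cap_seconds=60.0):
    """Return the wait, in seconds, before each retransmission.

    timeo is in tenths of a second (assumption matching timeo=7 -> 0.7 s).
    The wait doubles after every retransmission and is capped at
    cap_seconds, as the text describes. Hypothetical helper.
    """
    wait = timeo / 10.0
    schedule = []
    for _ in range(retrans):
        schedule.append(min(wait, cap_seconds))
        wait *= 2
    return schedule


if __name__ == "__main__":
    # UDP-style defaults: short first timeout, several retries.
    print(retransmit_schedule(timeo=7, retrans=5))
    # TCP-style setting: the first timeout already sits at the cap.
    print(retransmit_schedule(timeo=600, retrans=2))
```

Summing the returned list gives a rough idea of how long a request can remain unanswered before all retransmissions in the schedule have been sent.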

The Linux NFS client quietly retransmits RPC requests several times before reporting in the kernel log that it has lost contact with an NFS server. You can control how many times the client retransmits the same request before reporting the loss using the retrans mount option. Remember that whenever the hard mount option is in effect, an NFS client never gives up retransmitting an RPC until it gets a reply. Be careful not to use the similar-sounding retry mount option, which controls how long the mount command retries a backgrounded mount request before it gives up.

Retransmission for NFS over TCP works somewhat differently. The TCP network protocol contains its own timeout and retransmission mechanism that ensures packets arrive at the receiving end reliably and in order. The RPC client depends on this mechanism for recovering from the loss of RPC requests and thus uses a much longer timeout setting for NFS over TCP by default.

Due to a bug in the mount command, the default retransmission timeout value on Linux for NFS over TCP is six seconds, unlike other NFS client implementations. To obtain standard behavior, you may wish to specify timeo=600,retrans=2 explicitly when mounting via TCP. Unlike with NFS over UDP, using a short retransmission timeout with NFS over TCP does not have performance benefits and may increase the risk of data corruption.

In summary, timeo=600,retrans=2 is appropriate for TCP mounts. When using NFS over UDP, timeo=4,retrans=9 is a better choice. Using timeo=600 with NFS over UDP will result in very poor performance in the event of network or server problems. Using a very short timeout with TCP could cause network congestion or, in rare cases, data corruption.

4.2) Capping the Size of Read and Write Operations

In Linux, the rsize and wsize mount options have additional semantics compared with the same options as implemented in other operating systems.
Normally these options determine how large a network read or write operation can be before the client breaks it into smaller operations. Low rsize and wsize values are appropriate when adverse network conditions prevent NFS from working with higher values or when NFS must share a low-bandwidth link with interactive data streams.

By default, NFS client implementations choose the largest rsize and wsize values a server supports. However, if you do not explicitly set rsize or wsize when you mount an NFS file system on a Red Hat NFS client, the default value for both is a modest 4,096 bytes. Red Hat chose this default because it allows the Linux NFS client to work without adjustment in most environments. Usually on clean high-performance networks, or with NFS over TCP, you can improve NFS performance by explicitly increasing these values.

Normally, the Linux client caches application write requests, issuing NFS WRITE operations when it has at least wsize bytes to write. The NFS client often returns control to a writing application before it has issued any NFS WRITE operations. It also issues NFS READ operations in parallel before waiting for the server to reply to any of them.

If the rsize option is set below the system's page size (4KB on x86 hardware), the NFS client in 2.4 kernels issues individual read operations one at a time and waits for each operation to complete before issuing the next read operation. If the wsize option is set below the system's page size, the NFS client issues synchronous writes without regard to the use of the sync or async mount options. As with reads, synchronous writes cause applications to wait until the NFS server completes each individual write operation before issuing the next operation or before letting an application continue with other processing. When performing synchronous writes, the client waits until the server has written its data to stable storage before allowing an application to continue.
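The page-size rule for 2.4 kernels can be expressed as a simple comparison. The sketch below is a simplified model of the behavior described above, not client code: the function name and the returned labels are hypothetical, and the default page size is read from the machine running the script with os.sysconf, which may differ from the NFS client you are tuning.

```python
import os


def io_mode(rsize, wsize, page_size=None):
    """Classify NFS I/O on a 2.4-style client (simplified model).

    Reads are issued one at a time when rsize is below the page size;
    writes become synchronous when wsize is below the page size.
    Hypothetical helper for illustration.
    """
    if page_size is None:
        # Page size of the local machine, in bytes.
        page_size = os.sysconf("SC_PAGE_SIZE")
    return {
        "reads": "serialized" if rsize < page_size else "parallel",
        "writes": "synchronous" if wsize < page_size else "asynchronous",
    }


if __name__ == "__main__":
    print(io_mode(1024, 1024, page_size=4096))
    print(io_mode(32768, 32768, page_size=4096))
```

Passing page_size explicitly lets you check a planned rsize and wsize against the page size of a different client architecture.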

Some hardware architectures allow a choice of different page sizes. Intel Itanium systems, for instance, support pages up to 64KB. On a system with 64KB pages, the rsize and wsize limitations described above still apply; thus all NFS I/O is synchronous on these systems, significantly slowing read and write throughput. When running on hardware that supports different page sizes, choose a combination of page size and r/wsize that allows the NFS client to do asynchronous I/O if possible. Usually distributors choose a single large page size, such as 16KB, when they build kernels for hardware architectures that support multiple page sizes. This limitation has been removed in 2.6 kernels so that all read and write traffic is asynchronous whenever possible, independent of the rsize and wsize settings.

The network transport protocol (TCP or UDP) interacts in complicated ways with rsize and wsize. When you encounter poor performance because of network problems, using NFS over TCP is a better way to achieve good performance than using small read or write sizes over UDP. The NFS client and server fragment large UDP datagrams, such as single read or write operations more than a kilobyte in size, into individual IP packets. RPC over UDP retransmits a whole RPC request if any part of it is lost on the network, whereas RPC over TCP efficiently recovers a few lost packets and reassembles the complete requests at the receiving end.

Thus with NFS over TCP, 32KB read and write size usually provides good performance by allowing a single RPC to transmit or receive a large amount of data. With NFS over UDP, 32KB read and write size may provide good performance, but often using NFS over UDP results in terrible performance if the network is at all congested. For Linux, a good compromise value when using NFS over UDP is 8KB or less. If you find even that does not work well, and you cannot improve network conditions, we recommend switching to NFS over TCP if possible.
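A rough calculation shows why large UDP requests suffer on lossy networks. The Python sketch below is a back-of-the-envelope model under stated assumptions, not a measurement: it assumes IPv4 with a 20-byte header and no options, an 8-byte UDP header, independent packet loss, and it ignores the RPC and NFS header bytes. The function names are hypothetical.

```python
import math

IPV4_HEADER = 20  # bytes; assumes no IP options
UDP_HEADER = 8    # bytes


def udp_fragments(payload_bytes, mtu=1500):
    """Number of IP fragments needed to carry one UDP datagram."""
    # Fragment data must be a multiple of 8 bytes (except the last one).
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)


def rpc_survival(payload_bytes, packet_loss, mtu=1500):
    """Probability that every fragment of one RPC arrives.

    Over UDP the whole RPC is retransmitted if any fragment is lost,
    so the request succeeds only if all fragments get through.
    """
    return (1.0 - packet_loss) ** udp_fragments(payload_bytes, mtu)


if __name__ == "__main__":
    for size in (8192, 32768):
        print(size, udp_fragments(size), rpc_survival(size, 0.01))
```

Under these assumptions a 32KB request is split into many more fragments than an 8KB request, so the chance that the whole request must be retransmitted grows quickly as packet loss rises.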
It is very important to note that the capabilities of the Linux NFS server are different from the capabilities of the Linux NFS client. As of the [...] kernel release, the Linux NFS server does not support NFS over TCP and does not support rsize and wsize larger than 8KB. The Linux NFS client, however, supports NFS over both UDP and TCP and rsize and wsize up to 32KB. Some online documentation is confusing when it refers to features that Linux NFS supports. Usually such documentation refers to the Linux NFS server, not the client. Check with your Linux distributor to determine whether their kernels support serving files via NFS over TCP.

4.3) Special Mount Options

Consider using the bg option if your client system needs to be available even if it cannot mount some servers. This option causes mount requests to put themselves in the background automatically if a mount cannot complete immediately. When a client starts up and a server is not available, the client waits for the server to become available by default. The default behavior, which you can adjust with the retry mount option, results in waiting for almost a week before giving up.

The fg option is useful when you need to serialize your mount requests during system initialization. For example, you probably want the system to wait for /usr to become available before proceeding with multiuser boot. If you mount /usr or other critical file systems from an NFS server, you should consider using fg for these mounts. The retry mount option has no effect on foreground mounts. A foreground mount request will fail immediately without any retransmission if any problem occurs.

For security, you can also use the nosuid mount option. This causes the client to disable the special bits on files and directories. The Linux man page for the mount command recommends also disabling or removing the suidperl command when using this option.
Note that the filer also has a nosuid export option, which does roughly the same thing for all clients accessing an export. Interestingly, the filer's nosuid export option also disables the creation of special devices; if you notice programs that use special sockets and devices (such as screen) behaving strangely, check for the nosuid export option on your filers.

It is a common trick for system administrators to use the noatime mount option on local file systems to improve overall file system performance by preventing a synchronous metadata update every time a file is accessed in some way. Because NFS servers, not clients, control the values contained in a file's timestamps (access time, metadata change time, and data modify time) by default, this trick is not effective for NFS mounts. However, filers allow you to reduce the overhead caused by aggressive atime flushing if you set a volume's no_atime_update option. On a filer console, type help vol options for details.

4.4) Tuning NFS client cache behavior

Other mount options allow you to tailor the client's attribute caching and retry behavior. It is not necessary to adjust these behaviors under most circumstances. However, sometimes you must adjust NFS client behavior to make NFS appear to your applications more like a local file system, or to improve performance for metadata-intensive workloads.

There are a few indirect ways to tune client-side caching. First, the most effective way to improve client-side caching is to add more RAM to your clients. Linux will make appropriate use of the new memory automatically. To determine how much RAM you need to add, determine how large your active file set is and increase RAM to fit. This greatly reduces cache turnover rate. You should see fewer read requests and faster client response time as a result.

Some working sets will never fit in a client's RAM cache. Your clients may have 128MB or 4GB of RAM, for example, but you may still see significant client cache turnover. In this case, reducing cache miss latency is the best approach. You can do this by improving your network infrastructure and tuning your server to improve its performance.
Because a client-side cache is not effective in these cases, you may find that keeping the client's cache small is beneficial.

Normally, for each file in a file system that has been accessed recently, a client caches file attribute information, such as a file's last modification time and size. To detect file changes quickly yet efficiently, the NFS protocol uses close-to-open cache semantics. When a client opens a file, it uses a GETATTR operation to check that the file still exists and any cached data it has is still up-to-date. A client checks back with the server only after a timeout indicates that the file's attributes may be stale. During such a check, if the server's version of the attributes has changed, the client purges its cache.

A client can delay writes to a file indefinitely. When a client closes a file, however, it flushes all pending modifications to the file to the server. This allows a client to provide good performance in most cases, but means it might take some time before an application running on one client sees changes made by applications on other clients.

You may also want to improve attribute caching behavior. Due to the requirements of the NFS protocol, clients must check back with the server every so often to be sure cached attribute information is still valid. However, adding RAM on the client will not improve the rate at which the client tries to revalidate parts of the directory structure it has already cached. No matter how much of the directory structure is cached on the client, it must still validate what it knows when files are opened or when attribute cache information expires. You can lengthen the attribute cache timeout with the actimeo mount option to reduce the rate at which the client tries to revalidate its attribute cache. With the [...] kernel release, you can also use the nocto mount option to reduce the revalidation rate even further, at the expense of cache coherency among multiple clients.
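The interaction between close-to-open checks and the attribute cache timeout can be sketched as a small decision function. This is a deliberately simplified model for illustration, not the Linux client's logic: the function name and event labels are hypothetical, a single actimeo value in seconds stands in for the real minimum and maximum timeouts, and the 60-second default is an assumption.

```python
def needs_getattr(event, attr_age, actimeo=60, cto=True):
    """Decide whether a simplified client would revalidate attributes.

    event    -- "open" or any other file operation (hypothetical labels)
    attr_age -- seconds since the attributes were last fetched
    actimeo  -- attribute cache timeout in seconds (assumed default)
    cto      -- True for close-to-open semantics, False models nocto
    """
    if event == "open" and cto:
        # Close-to-open: every open checks back with the server.
        return True
    # Otherwise only revalidate once the cached attributes have expired.
    return attr_age >= actimeo


if __name__ == "__main__":
    print(needs_getattr("open", attr_age=5))                          # cto on
    print(needs_getattr("open", attr_age=5, cto=False))               # nocto
    print(needs_getattr("open", attr_age=5, actimeo=600, cto=False))  # long timeout
```

In this model, a longer actimeo and cto=False both reduce how often the function returns True, which corresponds to fewer on-the-wire revalidations at the cost of noticing changes later.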
The nocto mount option is appropriate for read-only mount points where files change infrequently, such as a lib, include, or bin directory, static HTML files, or image libraries. In combination with judicious settings of actimeo you can significantly reduce the number of on-the-wire operations generated by your NFS clients. Be careful to test this setting with your application to be sure that it will tolerate the delay before the NFS client notices file changes and fetches the new versions from the server.

The Linux NFS client delays application writes to combine them into larger, more efficiently processed requests. You can guarantee that a client immediately pushes every write system call an application makes to servers by using the sync mount option. This is useful when an application needs the guarantee that data is safe on disk before it continues. Frequently such applications already use the O_SYNC open flag or invoke the fsync system call when needed. Thus, the sync mount option is often not necessary.

Delayed writes and the client's attribute cache timeout can delay detection of changes on the server by many seconds while a file is open. The noac mount option prevents the client from caching file attributes. This means that every file operation on the client that requires file attribute information results in a GETATTR operation to retrieve a file's attribute information from the server. Note that noac also causes a client to process all writes to that file system synchronously, just as the sync mount option does. Disabling attribute caching is only one part of noac; it also guarantees that data modifications are visible on the server so that other clients using noac can detect them immediately. Thus noac is shorthand for actimeo=0,sync. When the noac option is in effect, clients still cache file data as long as they detect that a file has not changed on the server.
The noac mount option allows a client to keep very close track of files on a server so it can discover changes made by other clients quickly. Normally you will not use this option, but it is important when an application that depends on single-system behavior is deployed across several clients.

Using the noac mount option causes a 40% performance degradation on typical workloads, and some common workloads, such as sequential write workloads, can be impacted by up to 70%. Database workloads that consist of random reads and writes are generally less affected by noac. The noac option generates a very large number of GETATTR operations and sends write operations synchronously. Both of these add significant protocol overhead. The noac mount option trades off single-client performance for client cache coherency. Only applications that need tight cache coherency among multiple clients require that file systems be mounted with the noac mount option.

Some applications require direct, uncached access to data on a server. Using the noac mount option is sometimes not good enough, because even with this option, the Linux NFS client still caches reads. To ensure your application sees the server's version of a file's data and not potentially stale data cached by the client, your application can lock and unlock the file. This pushes all pending write operations back to the server and purges any remaining cached data, so the next read operation will go back to the server rather than reading from a local cache.

Alternatively, the Linux NFS client in the RHEL 3.0 and SUSE SLES 8 SP3 kernels supports direct I/O to NFS files when an application opens a file with the O_DIRECT flag. Direct I/O is a feature designed to benefit database applications that manage their own data cache. When this feature is enabled, an application's read and write system calls are translated directly into NFS read and write operations.
The Linux kernel never caches the results of any read or write when a file is opened with this flag, so applications always get exactly what's on the server. Because of I/O alignment restrictions in some versions of the Linux O_DIRECT implementation, applications must be modified to support direct I/O properly. See the appendix for more information on this feature, and its equivalent in RHEL AS 2.1, uncached I/O.

For some servers or applications, it is necessary to prevent the Linux NFS client from sending Network Lock Manager requests. You can use the nolock mount option to prevent the Linux NFS client from notifying the server's lock manager when an application locks a file. Note, however, that the client still flushes its data cache and uses more restrictive write back semantics when a file lock is in effect. The client always flushes all pending writes whenever an application locks or unlocks a file.

For detailed information on configuring and tuning filers in an Oracle environment, see the Network Appliance tech reports at [...].

4.5) Unmounting NFS File Systems

This section discusses the unique subtleties of unmounting NFS file systems on Linux. As on other *NIX operating systems, the umount command detaches one or more file systems from a client's file system hierarchy (like several other common features of *NIX, the name of this command is missing a letter). Normally, you use umount with no options, specifying only the path to the root of the file system you want to unmount. Sometimes you might want more than a standard unmount operation, for example when programs appear to be waiting indefinitely for access to files in a file system or if you want to clear the client's data cache. If you want to unmount all currently mounted NFS file systems, use:

    umount -a -t nfs

Sometimes unmounting a file system becomes stuck. For this, there are two options:

    umount -f /path/to/filesystem/root

This forces an unmount operation to occur if the mounted NFS server is not reachable, as long as there are no RPC requests on the client waiting for a reply.
After kernel [...], the second option,

    umount -l /path/to/filesystem/root

usually causes the kernel to detach a file system from the client's file system hierarchy immediately, but allows it to clean up RPC requests and open files in the background.

As mentioned above, unmounting an NFS file system does not interrupt RPC requests that are awaiting a server reply. If an umount command fails because processes are waiting for network operations to finish, you must interrupt each waiting process using ^C or an appropriate kill command. Stopping stuck processes usually happens automatically during system shutdown so that NFS file systems can be safely unmounted. The NFS client allows these methods to interrupt pending RPC requests only if the intr option is set for that file system. To identify processes waiting for NFS operations to complete, use the lsof command.

The umount command accepts file system specific options via the -o flag. The NFS client does not have any special options.

4.6) Mount Option Examples

We provide the following examples as a basis for beginning your experimentation. Start with an example that closely matches your scenario, then thoroughly test the performance and reliability of your application while refining the mount options you have selected.

On older Linux systems, if you do not specify any mount options, the Linux mount command (or the automounter) automatically chooses these defaults:

    mount -o rw,fg,vers=2,udp,rsize=4096,wsize=4096,hard,intr,timeo=7,retrans=5

These default settings are designed to make NFS work right out of the box in most environments. Almost every NFS server supports NFS version 2 over UDP. Rsize and wsize are relatively small because some network environments fragment large UDP packets, which can hurt performance if there is a chance that fragments can be lost. The RPC retransmit timeout is set to 0.7 seconds by default to accommodate slow servers and networks.

On clean single-speed networks, these settings are unnecessarily conservative. Over some firewalls, UDP packets larger than 1,536 bytes are not supported, so these settings do not work. On congested networks, UDP may have difficulty recovering from large numbers of dropped packets. NFS version 2 write performance is usually slower than NFS version 3. As you can see, there are many opportunities to do better than the default mount options.

Here is an example of mount options that are reasonable defaults. In fact, on many newer Linux distributions, these are the default mount options.

    mount -o rw,bg,vers=3,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr

Using the bg option means our client will be able to finish booting without waiting for filers that may be unavailable because of network outages. The hard option minimizes the likelihood of data loss during network and server instability, while intr allows users to interrupt applications that may be waiting for a response from an unavailable server.
The tcp option works well on many typical LANs with 32KB read and write size. Using timeo=600 is a good default for TCP mounts, but for UDP mounts, timeo=4 might be more appropriate.

Here is an example of a poor combination of mount options:

    mount -o rw,soft,udp,rsize=1024,wsize=1024

Using soft mounts with UDP is a recipe for silent data corruption. On Linux 2.4 with these mount options it is made worse. The wsize=1024 mount option on Linux 2.4 mandates synchronous writes; writes go to the server one at a time rather than in groups, and the client requires the server to push data onto disk before responding to the client's write operation request. If a server gathers writes to improve disk bandwidth, it delays its response to each write request waiting for more write requests, which can trigger the client to retry write requests. A server that delays responding to writes as long as several hundred milliseconds will probably cause the client to drop requests unnecessarily after several retries (note that filers usually do not delay writes because they can cache these requests in nonvolatile RAM to fulfill NFSv2's requirement for stable writes). To address these issues, always use the hard option and use read and write sizes larger than your client's page size.

When mounting a group of home directories over a WAN, you might try:

    mount -o rw,bg,vers=3,nosuid,tcp,timeo=600,retrans=2,rsize=2048,wsize=2048,soft,intr

This example uses NFS over TCP because NFS clients often reside on slower, less capable networks than servers. In this case, the TCP protocol can provide fast recovery from packet losses caused by network speed transitions and noisy phone lines. Using the nosuid mount option means users cannot create or use suid programs that reside in their home directories, providing a certain degree of safety from Trojan horses. Limiting the maximum size of read and write operations gives interactive sessions on slow network links an advantage by keeping very large packets off the wire. On fast networks, large rsize and wsize values, such as 32768, are more appropriate. The soft option helps to recover quickly from server or network outages with a minimal risk of possible data loss, and the timeo=600 option allows the TCP protocol a long time to attempt recovery before the RPC client interferes.

When mounting a filer from an anonymous FTP or HTTP server, you could use:

    mount -o ro,fg,vers=3,tcp,timeo=600,retrans=2,rsize=32768,wsize=32768,hard,nointr,nocto,actimeo=600

Here we use the fg option to ensure that NFS files are available before the FTP or HTTP server is started. The ro option anticipates that the FTP or HTTP server will never write data into files. The nocto option helps reduce the number of GETATTR and LOOKUP operations at the expense of tight cache coherency with other clients. The FTP server will see changes to files on the server after its attribute cache times out (usually after about one minute). Lengthening the attribute cache timeout also reduces the attribute cache revalidation rate.
When mounting a filer for use with a single-instance Oracle database system over a clean Gigabit Ethernet, you might try:

    mount -o rw,fg,vers=3,tcp,timeo=600,retrans=2,hard,nointr

Again, the fg option ensures that NFS file systems are available before the database instance starts up. We use TCP here because even though the physical network is fast and clean, TCP adds extra data integrity guarantees. The hard option ensures data integrity in the event of network problems or a cluster failover event. The nointr option prevents signals from interrupting NFS client operations. Such interruptions may occur during a shutdown abort, for instance, and are known to cause database corruption. File locking should be enabled when running databases in production as a degree of protection against corruption caused by improper backup procedures (for example, another instance of the same database running at a disaster recovery site against the same files as your normal production instance). See the appendix for more information on how to set up databases on Linux, including details on how to adjust the Linux read-ahead algorithm.

In summary:

- Use NFS version 3 if possible.
- Use NFS over TCP wherever possible.
- If NFS over UDP is slow or hangs, this is a sign of network problems, so try TCP instead.
- Avoid using the soft mount option.
- Try the special mount options if you need an extra boost in performance.
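The summary recommendations can be turned into a rough checklist for an option string. The Python sketch below is a hypothetical helper, not part of mount or any NFS tool: the function name, the warning texts, and the default 4KB page size are assumptions, and the checks encode only the rules of thumb stated in this section.

```python
def lint_mount_options(options, page_size=4096):
    """Return a list of warnings for an NFS mount option string.

    options   -- comma-separated string, as passed to mount -o
    page_size -- client page size in bytes (assumed 4KB by default)
    """
    opts = {}
    for item in options.split(","):
        key, _, value = item.strip().partition("=")
        if key:
            opts[key] = value

    warnings = []
    if "soft" in opts:
        warnings.append("soft: risk of data corruption, prefer hard")
    if "udp" in opts:
        warnings.append("udp: prefer tcp where possible")
    if opts.get("vers") == "2":
        warnings.append("vers=2: prefer NFS version 3")
    for name in ("rsize", "wsize"):
        if name in opts and int(opts[name]) < page_size:
            warnings.append(name + ": below page size, I/O is serialized on 2.4")
    return warnings


if __name__ == "__main__":
    for w in lint_mount_options("rw,soft,udp,rsize=1024,wsize=1024"):
        print(w)
```

Treat the warnings as prompts for review rather than verdicts; the WAN example in this section deliberately uses soft and small transfer sizes, and such a checker would flag it.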

5) Performance

This section covers aspects of Linux client performance, with a special focus on networking.

5.1) Linux NFS client performance

The Linux NFS client runs in many different environments, from light desktop usage to databases with a dedicated private SAN. In general, the Linux NFS client can perform as well as most other NFS clients, and better than some, in these environments. However, you must plan your mount options and observe network behavior carefully to ensure the Linux NFS client performs at its best.

On low-speed networks (10Mb/sec or 100Mb/sec), the Linux NFS client can read and write as fast as the network allows. This means Linux NFS clients running anything faster than a 400 MHz processor can saturate a 100Mb link. Slower network interfaces (for example, 16-bit PCMCIA cards on laptops) noticeably reduce client-side NFS bandwidth. Be sure your clients have enough CPU to drive the network while concurrently handling your application workload.

If your clients use high-performance networking (gigabit or faster), you should plan to provide enough CPU and memory bandwidth on your clients to handle the interrupt and data rate. The NFS client software and the gigabit driver cut into resources available to applications, so make sure there are enough to go around. Most gigabit cards that support 64-bit PCI or better should provide good performance. We have found these cards work well with Linux:

SysKonnect: The SysKonnect SK-98XX series cards work very well with Linux and support single- and dual-fiber and copper interfaces for better performance and availability. A mature driver for this card exists in the 2.4 kernel source distribution.

Broadcom: Many cards and switches use this chipset, including the ubiquitous 3Com solutions. This provides a high probability of compatibility between your switches and your clients. The driver software for this chipset appeared in the [...] Linux kernel and is included in Red Hat distributions with earlier 2.4 kernels. Check Broadcom's web site for driver updates, as several recent drivers have had performance problems.

Intel EEPro/1000: This appears to be the fastest gigabit card available for Intel-based systems, but the card's driver software is included only in recent kernel source distributions ([...] and later) and may be somewhat unstable. You can find the card's driver software for earlier kernels at Intel's Web site. There are reports that the jumbo frame MTU for Intel's cards is only 8,998 bytes, not the standard 9,000 bytes.

SysKonnect now publishes an independent Gigabit Ethernet NIC shootout on its Web site. See the appendix for information on how to obtain this document.

For most purposes, Gigabit Ethernet over copper works about as well as Gigabit Ethernet over fiber for short-distance links. Category 5E or Category 6 cables are necessary for reliable performance on copper links. Fiber adds long-haul capabilities and even better reliability, but at a significant cost. Some find that copper terminations are more rugged and reliable than fiber terminations.

All of these cards support Gigabit Ethernet's jumbo frames option. If you use Linux NFS clients and filers together on an unrouted network, consider using jumbo frames to improve the performance of your application. Be sure to consult your switch's command reference to make sure it is capable of handling jumbo frames in your environment. There are some known problems in Linux drivers and the networking layer when using the maximum frame size (9,000 bytes). If you experience unexpected performance slowdowns when using jumbo frames, try reducing the MTU to, say, 8,960 bytes.

When using jumbo frames on more complex networks, ensure that every link in the network between your client and server supports them and has the support enabled. If NFS over TCP is working with jumbo frames, but NFS over UDP is not, that may be a sign that some part of your network does not support jumbo frames.

The Linux NFS client and network layer are sensitive to network performance and reliability. After you have set your mount options as we recommend, you should get reasonable performance. If you do not and your workload is not already CPU-bound, you should begin looking at network conditions between your clients and servers. For example, on a clean Gigabit Ethernet network, a single Linux client can write to an F880 filer as fast as the filer can put data on disk. If there is other network traffic or packet loss, write performance from a Linux client on NFS over UDP can drop rapidly, though on NFS over TCP, performance should remain reasonable. Read performance depends on the size and speed of the client's and the filer's memory.

The most popular choices for 100BaseT network interfaces are the Intel EEPro/100 series and the 3Com 3C905 family. These are often built into server mainboards and can support network booting via PXE, described in more detail later in this document. A recent choice among mainboard integrators is the RTL chipset. All of these implementations perform well in general, but some versions of their driver software have known bugs. Check your distributor's errata database for more information. Recent versions should work fairly well.

When choosing a 100BaseT card for your Linux systems, look for a card that has a mature driver that is already integrated into the Linux kernel.
Features such as checksum offloading are beneficial to performance. You can use Ethernet bonding, or trunking, to improve the reliability or performance of your client.

Most network interface cards use autonegotiation to obtain the fastest settings allowed by the card and the switch port to which it attaches. Sometimes, PHY chipset incompatibilities may result in constant renegotiation, or in negotiating half duplex or a slow speed. When diagnosing a network problem, be sure your Ethernet settings are as you expect before looking for other problems. To solve an autonegotiation problem, you may be inclined to hard code the settings you want, but avoid this because it only masks a deeper problem. Work with your switch and card vendors to resolve these problems.

In summary:
- Whether you use TCP or UDP, be sure you have a clean network.
- Ensure your network cards always negotiate the fastest settings, and that your NIC drivers are up to date.
- Increasing the size of the client's socket buffer (see appendix) is always a wise thing to do.

5.2) Linux NFS Client Architecture

There are five layers of software between your application and the network. First, the generic VFS layer in Linux adapts system calls made by your application to generic interface calls supported by all file systems. It also manages the main file system caches, including the data cache and the attribute caches. The second layer is the NFS client itself, which adapts the generic file system interface calls into NFS RPC requests to a server. The client is responsible for converting the local environmental conditions of the native kernel to the NFS protocol and back. It generates RPC requests that it hands to the RPC client.

The RPC client is the third layer. It takes whole NFS RPC requests from the client and converts them into socket calls. It marshals and unmarshals data, manages byte ordering, waits for server replies, and runs extra tasks, such as the flusher daemon, when necessary. The next lower layer is the Linux network layer, which handles TCP, UDP, and IP processing. The bottom layer on the client contains the network interface device driver and kernel interrupt scheduling logic.

[Figure 1 diagram, top to bottom. User space: applications. System call interface. Kernel space: VFS layer, VFS file system interface, NFS client FS, RPC client, socket interface, TCP / UDP / IP, NIC drivers. Side labels: page cache, dentry cache, inode cache, rpciod, asynchronous tasks, credential cache.]

Figure 1) Linux NFS client schematic diagram. There are five individual layers of software between an application and the network. These manage data and attribute caches, asynchronous tasks, and network protocol translation.

This architecture is unlike that of some other NFS client implementations, which do not separate their NFS file system implementation from the RPC client. Other implementations use a transport-independent RPC implementation that is less efficient than using sockets directly, as Linux does.

The stability and performance of what you know as the Linux NFS client depend on the ability of all of these layers to function well. Very often, the NFS client layer functions well and properly, but some other part of the kernel, such as the Linux TCP/IP implementation or page cache, or something outside your Linux client such as your network topology, causes a problem that appears as though your NFS client is misbehaving. When diagnosing client problems, you should always keep in mind that there are many possible ways for a complex system such as this to misbehave.

5.3) Diagnosing Performance Problems with the Linux NFS Client

Now that you understand how the client is structured, let us review specific methods and tools for diagnosing problems on the client.

The client works best when the network does not drop any packets. The NFS and RPC clients also compete with applications for available CPU resources. These are the two main categories of client performance problems you may encounter.

Packet loss on the network is the first thing to check. With NFS over UDP, a high retransmission count can indicate packet loss due to network or server problems. With NFS over TCP, the network layer on the client handles network packet loss, but server problems still show up as retransmissions. On some 2.4 kernels, TCP retransmissions are also a sign of large application writes on the client that have filled the RPC transport socket's output buffer.

To see retransmissions, you can use nfsstat -c at a shell prompt. At the top of the output, you will see the total number of RPCs the client has sent and the number of times the client had to retransmit an RPC.
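As an illustration only, the two counters at the top of the nfsstat -c output can be read by a short script. This sketch assumes the "Client rpc stats" block has a header row naming the calls and retrans columns followed by one row of values; the sample text and its numbers are made up for the example, and the layout may differ between nfs-utils releases.

```python
def parse_client_rpc_stats(nfsstat_output):
    """Return (calls, retrans) from the 'Client rpc stats' block of `nfsstat -c`."""
    lines = [line.strip() for line in nfsstat_output.splitlines()]
    for i, line in enumerate(lines):
        if line.lower().startswith("client rpc stats"):
            # Assumed layout: one header row, then one row of values.
            header = lines[i + 1].split()
            values = lines[i + 2].split()
            stats = dict(zip(header, values))
            return int(stats["calls"]), int(stats["retrans"])
    raise ValueError("no 'Client rpc stats' block found")


def retransmit_rate(calls, retrans):
    """Retransmissions divided by total RPCs; 0.0 when no RPCs were sent."""
    return retrans / calls if calls else 0.0


# Hypothetical sample output, not taken from a real client.
SAMPLE = """\
Client rpc stats:
calls      retrans    authrefrsh
200000     900        0
"""

calls, retrans = parse_client_rpc_stats(SAMPLE)
print(f"{retransmit_rate(calls, retrans):.2%} of RPCs were retransmitted")
```

On a live client the same functions could be fed the captured output of nfsstat -c instead of the sample string.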
The retransmit rate is determined by dividing the number of retransmissions by the total number of RPCs. If the rate exceeds a few tenths of a percent, network losses may be a problem for your performance.

NFS over TCP does not show up network problems as clearly as UDP and performs better in the face of packet loss. If your TCP mounts run faster than your UDP mounts, that's a sure sign that the network between your clients and your filer is dropping packets or is otherwise bandwidth-limited. Normally UDP is as fast as or slightly faster than TCP.

The client keeps network statistics that you can view with netstat -s at a shell prompt. Look for high error counts in the IP, UDP, and TCP sections of this command's output. The same command also works on a filer's console. There, look for nonzero counts in the "fragments dropped after timeout" and "fragments dropped (dup or out of space)" fields in the IP section.

There are a few basic sources of packet loss.

1. If the end-to-end connection between your clients and servers contains links of different speeds (for instance, the server is connected via Gigabit Ethernet, but the clients are all connected to the network with 100Base-TX), packet loss occurs at the point where the two speeds meet. If a gigabit-connected server sends a constant gigabit stream to a 100Mb client, only one packet in 10 can get to the client. UDP does not have any flow control built in to slow the server's transmission rate, but TCP does; thus, TCP provides reasonable performance through a link speed change.

2. Another source of packet loss is small packet buffers on switches. If either the client or server bursts a large number of packets, the switch may buffer them before sending them on. If the switch buffer overflows, the packets are lost. It is also possible that a switch can overrun a client's NIC in a similar fashion. This becomes a greater possibility for large UDP datagrams because switches and NICs tend to burst an entire IP packet (all of its fragments) at once.

3. The client's IP layer will drop UDP datagrams if the client's send or receive socket buffer runs out of space for any reason. The client's RPC client allocates a socket for each mount. By default, these sockets use 64KB input and output buffers, which is too small on systems that use large rsize or wsize or generate a large number of NFS operations in a short period. To increase the size of these buffers, follow the instructions in the appendix to this document. There is no harm in doing this on all your Linux clients unless they have less than 16MB of RAM.

If you have resolved these issues and still have poor performance, you can attempt end-to-end performance testing between one of your clients and a similar system on the server's LAN using a tool such as ttcp or iperf. This exposes problems that occur in the network outside of the NFS protocol. When running tests such as iperf, select UDP tests, as these directly expose network problems. If your network is full duplex, run iperf tests in both directions concurrently to ensure your network is capable of handling a full load of traffic in both directions simultaneously.

If you find some client operations work normally while others cause the client to stop responding, try reducing your rsize and wsize to 1,024 to see if there are fragmentation problems preventing large operations from succeeding.

One more piece of network advice: Become familiar with network snooping tools such as tcpdump, tcpslice, and ethereal. On the filer, you can run pktt, which generates trace files in tcpdump format that you can analyze later on a client.
These tools provide the last word on what is really happening on your network between your clients and filers. You may need to run both tcpdump on a client and pktt on your filer at the same time and compare the traces to determine where the problem lies.

You must explicitly specify several options to collect clean network traces with tcpdump. Be sure the snaplen option (-s) is set large enough to capture all the interesting bytes in each packet, but small enough that tcpdump is not overwhelmed with incoming traffic. If tcpdump is overwhelmed, it drops incoming packets, making the network trace incomplete. The default value is 96 bytes, which is too short to capture all the RPC and NFS headers in each packet. Usually a value of 256 bytes is a good compromise for UDP, but you can set it to zero if you need to see all the data in each packet. Snooping TCP packets requires a zero snaplen because TCP can place several RPC requests in a single network packet. If snaplen is short, the trace will miss RPCs that are contained near the end of long packets.

In addition, always use filtering to capture just the traffic between the client and the server. Again, this reduces the likelihood that tcpdump or your local file system will be overwhelmed by incoming traffic and makes later analysis easier. You can collect traffic to or from your client using the hostname filter. Several other tcpdump options allow you to collect traffic destined for one or more hosts at a time; read the manual to find out more.

An automounter can cause a lot of network chatter, so it is best to disable the automounter on your client and set up your mounts by hand before taking a network trace.
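The snaplen and filtering advice above can be captured in a small helper that assembles a tcpdump command line. This is a sketch under stated assumptions: the server name, output file name, and function name are hypothetical, and only the long-standing tcpdump options -s (snaplen), -w (write to file), and -i (interface), plus the host filter primitive, are used.

```python
import shlex


def tcpdump_command(server, protocol="udp", outfile="nfs-trace.pcap", interface=None):
    """Build a tcpdump argument list for tracing traffic to or from one server."""
    if protocol not in ("udp", "tcp"):
        raise ValueError("protocol must be 'udp' or 'tcp'")
    # 256 bytes is the compromise suggested for UDP; TCP needs whole packets (0).
    snaplen = 256 if protocol == "udp" else 0
    argv = ["tcpdump", "-s", str(snaplen), "-w", outfile]
    if interface:
        argv += ["-i", interface]
    # Filter expression: keep only traffic to or from the server.
    argv += ["host", server]
    return argv


# filer1.example.com is a placeholder host name.
command = tcpdump_command("filer1.example.com", protocol="tcp")
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the argument list separately from running it makes it easy to review the exact options before starting a capture as root.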

To find out if your application is competing for CPU resources with the NFS client, RPC client, or network layer on your client system, you can use the top program. If you see the rpciod process at the top of the listing, you know that NFS operations are dominating the CPU on your system. In addition, if you see system CPU percentage increase significantly when your application accesses NFS data, this also can indicate a CPU shortage. In many cases, adding more CPUs or faster CPUs helps. Switching to UDP may also be a choice for reducing CPU load if an uncongested high-speed network connects your clients and filer. As the Linux NFS client and networking layer improve over time, they will become more CPU-efficient and include features such as TCP offload that reduce the amount of CPU required to handle large amounts of data.

There are certain cases where processes accessing NFS file systems may hang. This is most often due to a network partition or server outage. Today's client implementation is robust enough to recover in most cases. Occasionally a client fails because of high load or some other problem. Unfortunately, little can be done in these cases other than rebooting the client and reporting the problem.

5.4) Error Messages in the Kernel Log

There are two messages that you may encounter frequently in the kernel log (this is located in /var/log/messages on Linux systems). The first is "server not responding." This message occurs after the client retransmits several times without any response from a server. If you know the server is up, this can indicate the server is sluggish or there are network problems. If you know the server is down, this indicates the client is waiting for outstanding operations to complete on that server, and it is likely there are programs waiting for the server to respond.

The second, perhaps more frustrating, message is "can't get request slot."
This message indicates that the RPC client is queuing messages and cannot send them. This is usually due to network problems such as a bad cable, incorrectly set duplex or flow control options, or an overloaded switch. It may appear as if your client is stuck at this point, but you should always wait at least 15 minutes for network and RPC client timeouts to recover before trying harsher remedies such as rebooting your client or filer.

5.5) Oops

If a program encounters an unrecoverable situation in user space, it stops running and dumps core. If the same process encounters an unrecoverable situation within the Linux kernel, the kernel attempts to isolate the process, stop it, and report the event in the kernel log. This is called an oops. Many times a failing process holds system resources that are not released during an oops. The kernel can run for a few more moments, but generally other processes deadlock or themselves terminate with an oops when they attempt to allocate resources that were held by the original failing process.

If you are lucky, the kernel log contains useful information after the system recovers. Red Hat kernels automatically translate the kernel log output into symbolic information that means something to kernel developers. If the oops record contains only hexadecimal addresses, try using the ksymoops tool in the kernel source tree to decode the addresses.

Sometimes using a serial console setup can help capture output that a failing kernel cannot write into its log. Use a crossover cable (also sometimes called a file transfer or null modem cable) to connect your client to another system's serial console via its COM1 or COM2 port. On the receiving system, use a tool such as the minicom program to access the serial port. Serial console support is built into kernels distributed by Red Hat, but be sure to enable the serial console option in kernels you build yourself. Finally, you should set appropriate boot command line options (instructions provided in the Documentation directory of the Linux kernel source tree) to finish enabling the serial console. Optionally, you can also update /etc/inittab to start a mingetty process on your serial console if you'd like to log in there.

5.6) Getting Help

Most Linux NFS client performance problems are due to lack of CPU or memory on the client, incorrect mount options, or packet losses on the network between the client and servers. If you have set up your client correctly and your network is clean, but you still suffer from performance or reliability problems, you should contact experts to help you proceed further.

Currently, there is no professionally maintained knowledge base that tracks Linux NFS client issues. However, expert help is available on the Web at nfs.sourceforge.net, where you can find a Linux NFS Frequently Asked Questions list, as well as several how-to documents. There is also a mailing list specifically for helping administrators get the best from Linux NFS clients and servers. Network Appliance customers can also search the NOW database for Linux-related issues. Network Appliance also maintains some of the more salient Linux issues within its BURT database. See the appendix in this report for more information.

If you find there are missing features or performance or reliability problems, we encourage you to participate in the community development process. Unlike proprietary operating systems, new features appear in Linux only when users implement them. Problems are fixed when users are diligent about reporting them and following up to see that they are really fixed. If you have ever complained about the Linux NFS client, here is your opportunity to do something about it.

When you have found a problem with the Linux NFS client, you can report it to your Linux distributor. Red Hat, for instance, supports an online bug database based on bugzilla.
You can access Red Hat's bugzilla instance at

When filing a BURT that relates to Linux client misbehavior with a filer, be sure you report:

- The Linux distribution and the Linux kernel release (e.g., Red Hat 7.2 with kernel ).
- The client's kernel configuration (/usr/src/linux/.config is the usual location) if you built the kernel yourself.
- Any error messages that appear in the kernel log, such as oops output or reports of network or server problems.
- All mount options in effect (use cat /proc/mounts to display them, and do not assume they are the same as the options you specified on your mount commands).
- Details about the network topology between the client and the filer, such as how busy the network is, how many switches and routers, what link speeds, and so on. You can report network statistics on the client with nfsstat -c and netstat -s.
- Client hardware details, such as SMP or UP, which NIC, and how much memory. You can use the lspci -v command and cat /proc/cpuinfo on most distributions to collect most of this.
- A network trace and/or a dump of debugging messages (see the appendix).

Most importantly, you should carefully describe the symptoms on the client. A "client hang" is generally not specific enough. This could mean the whole client system has deadlocked or that an application on the client has stopped running. Always be as specific as you can.

In summary:
- If you cannot find what you need in this paper or from other resources, contact your Linux distributor or Network Appliance to ask for help.
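Much of the information in the reporting checklist above can be gathered mechanically. The following is a minimal sketch, assuming a Linux client with the named tools installed; the function names and the report layout are hypothetical, and commands or files that are missing are noted in the output rather than treated as errors.

```python
import platform
import subprocess
from pathlib import Path

# Commands and files named in the checklist; all of them are read-only.
COMMANDS = [["nfsstat", "-c"], ["netstat", "-s"], ["lspci", "-v"]]
FILES = ["/proc/mounts", "/proc/cpuinfo", "/usr/src/linux/.config"]


def run(argv):
    """Return a command's output, or a short note if it cannot be run."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=30)
        return done.stdout or done.stderr
    except (OSError, subprocess.SubprocessError) as exc:
        return f"(could not run {argv[0]}: {exc})\n"


def read(path):
    """Return a file's contents, or a short note if it is unreadable."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError as exc:
        return f"(could not read {path}: {exc})\n"


def collect_report():
    """Assemble the sections of a client report as one block of text."""
    sections = [("kernel release", platform.release() + "\n")]
    sections += [(" ".join(argv), run(argv)) for argv in COMMANDS]
    sections += [(path, read(path)) for path in FILES]
    return "".join(f"==== {title} ====\n{body}\n" for title, body in sections)


if __name__ == "__main__":
    print(collect_report())
```

The script does not replace a careful description of the symptoms, and kernel log excerpts and network traces still have to be attached by hand.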

6) Other Sundries

This section covers auxiliary services you may need to support advanced NFS features.

6.1) Telling Time

The clock on your Linux clients must remain synchronized with your filers to avoid problems such as authentication failures or incomplete software builds. Usually you set up a network time service such as NTP and configure your filers and clients to update their time using this service. After you have properly configured a network time service, you can find more information on enabling NTP on your filers in the Data ONTAP System Administrator's Guide.

Linux distributions usually come with a prebuilt network time protocol daemon. If your distribution does not have an NTP daemon, you can build and install one yourself by downloading the latest ntpd package from the Internet (see the appendix).

There is little documentation available for the preinstalled NTP daemon on Linux. To enable NTP on your clients, be sure the ntpd startup script runs when your client boots (look in /etc/rc.d or /etc/init.d; the exact location varies, depending on your distribution; for Red Hat systems, you can use chkconfig --level 35 ntpd on). You must add the network time server's IP address to /etc/ntp/step-tickers and /etc/ntp.conf.

If you find that the time protocol daemon is having some difficulty maintaining synchronization with your time servers, you may need to create a new drift file. Make sure your client's /etc/ntp directory and its contents are permitted to the ntp user and group to allow the daemon to update the drift file, and disable authentication and restriction commands in /etc/ntp.conf until you are sure everything is working correctly. As root, shut down the time daemon and delete the drift file (usually /etc/ntp/drift). Then restart the time daemon. After about 90 minutes, it will write a new drift file into /etc/ntp/drift. Your client system should keep better time after that.
Always keep the date, time, and time zone on your filer and clients synchronized. Not only will you ensure that any time-based caching on your clients works correctly, but it will also make debugging easier by aligning time stamps in client logs and on client network trace events with the filer's message log and pktt traces.

6.2) Security

Today, the Linux NFS client supports only two types of authentication: AUTH_NULL and AUTH_UNIX. In future releases, Linux will support Kerberos 5, just as Solaris does today, via RPCSEC_GSS. Later versions of the NFS protocol (e.g., NFSv4) support a wide variety of authentication and security models, including Kerberos, and a form of public key authentication called SPKM.

To maintain the overall security of your Linux clients, be sure to check for and install the latest security updates from your distributor. You can find specific information on Linux NFS client security in Chapter 6 of the NFS how-to located at nfs.sourceforge.net (see appendix).

When a host wants to send UDP packets that are larger than the network's maximum transfer unit, or MTU, it must fragment them. Linux divides large UDP packets into MTU-sized IP fragments and sends them to the receiving host in reverse order; that is, the fragment that contains the bytes with the highest offset is sent first. Because Linux sends IP fragments in reverse order, its NFS client may not interoperate with some firewalls. Certain modern firewalls examine the first fragment in a packet to determine whether to pass the rest of the fragments. If the fragments arrive in reverse order, the firewall discards the whole packet. A possible workaround is to use only NFS over TCP when crossing such firewalls. If you find some client operations work normally while others cause the client to stop responding, try reducing your rsize and wsize to 1,024 to see if there are fragmentation problems preventing large RPC requests from succeeding.

Firewall configuration can also block auxiliary ports that the NFS protocol requires to operate. For example, traffic may be able to pass on the main NFS port numbers, but if a firewall blocks the mount protocol or the lock manager or port manager ports, NFS cannot work. This applies to standalone router/firewall systems as well as local firewall applications such as tcpwrapper, ipchains, or iptables that might run on the client system itself. Be sure to check if there are any rules in /etc/hosts.deny that might prevent communications between your client and server.

6.3) Network Lock Manager

The NFS version 2 and 3 protocols use separate side-band protocols to manage file locking. On Linux 2.4 kernels, the lockd daemon manages file locks using the NLM (Network Lock Manager) protocol, and the rpc.statd program manages lock recovery using the NSM (Network Status Monitor) protocol to report server and client reboots. The lockd daemon runs in the kernel and is started automatically when the kernel starts up at boot time. The rpc.statd program is a user-level process that is started during system initialization from an init script. If rpc.statd is not able to contact servers when the client starts up, stale locks will remain on the servers that can interfere with the normal operation of applications.

The rpcinfo command on Linux can help determine whether these services have started and are available.
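As a rough illustration of such a check, the sketch below scans the table printed by rpcinfo -p for the services a client needs. It assumes the usual five-column layout (program, version, protocol, port, service name) and the conventional service names portmapper, status (rpc.statd), and nlockmgr (lockd); the sample table is invented for the example.

```python
def registered_services(rpcinfo_output):
    """Return the set of service names listed by `rpcinfo -p`."""
    services = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows: program, version, protocol, port, service name.
        if len(fields) == 5 and fields[0].isdigit():
            services.add(fields[4])
    return services


def missing_services(rpcinfo_output, required):
    """Return the required services that are not registered, sorted."""
    return sorted(set(required) - registered_services(rpcinfo_output))


# Hypothetical output from a client where rpc.statd has not started.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100021    1   udp  32769  nlockmgr
"""

print(missing_services(SAMPLE, ["portmapper", "status", "nlockmgr"]))  # prints ['status']
```

The same functions can be applied to the output of rpcinfo -p run against a server name to see whether a firewall or a stopped daemon is hiding the mount or lock manager services.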
If rpc.statd is not running, use the chkconfig program to check that its init script (which is usually /etc/init.d/nfslock) is enabled to run during system bootup. If the client host's network stack is not fully initialized when rpc.statd runs during system startup, rpc.statd may not send a reboot notification to all servers. Some reasons network stack initialization can be delayed are slow NIC devices, slow DHCP service, or CPU-intensive programs running during system startup. Network problems external to the client host may also cause these symptoms.

Because status monitoring requires bidirectional communication between server and client, some firewall configurations can prevent lock recovery from working. Firewalls may also significantly restrict communication between a client's lock manager and a server. Network traces captured on the client and server at the same time usually reveal a networking or firewall misconfiguration. Read the section on using Linux NFS with firewalls carefully if you suspect a firewall is preventing lock management from working.

Your client's nodename determines how a filer recognizes file lock owners. You can easily find out what your client's nodename is using the uname -n or hostname command. (A system's nodename is set on Red Hat systems during boot using the HOSTNAME value set in /etc/sysconfig/network.) The rpc.statd daemon determines which name to use by calling gethostbyname(3), or you can specify it explicitly when starting rpc.statd using the -n option.

If the client's nodename is fully qualified (that is, it contains the hostname and the domain name spelled out), then rpc.statd must also use a fully qualified name. Likewise, if the nodename is unqualified, then rpc.statd must use an unqualified name. If the two values do not match, lock recovery will not work. Be sure the result of gethostbyname(3) matches the output of uname -n by adjusting your client's nodename in /etc/hosts, DNS, or your NIS databases.
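The nodename comparison described above can be approximated from user space. This sketch assumes that the canonical name returned by Python's socket.gethostbyname_ex is a reasonable stand-in for what rpc.statd obtains from gethostbyname(3); the helper names are hypothetical, and "contains a dot" is used as a simple test for a fully qualified name.

```python
import os
import socket


def is_fully_qualified(name):
    """Treat a name as fully qualified if it contains a domain part."""
    return "." in name.strip(".")


def names_consistent(nodename, resolved):
    """True when both names match and agree on being qualified or unqualified."""
    same_form = is_fully_qualified(nodename) == is_fully_qualified(resolved)
    return same_form and nodename.lower() == resolved.lower()


if __name__ == "__main__":
    nodename = os.uname().nodename  # same value as `uname -n`
    try:
        # Canonical name from the resolver (hosts file, DNS, or NIS).
        resolved = socket.gethostbyname_ex(nodename)[0]
    except OSError as exc:
        resolved = ""
        print(f"could not resolve {nodename}: {exc}")
    print(nodename, resolved, names_consistent(nodename, resolved))
```

A result of False suggests that the nodename and the resolver disagree, which is the condition under which lock recovery is expected to fail.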

Similarly, you should account for client hostname clashes in different subdomains by ensuring that you always use a fully qualified domain name when setting up a client's nodename during installation. With multihomed hosts and aliased hostnames, you can use rpc.statd's -n option to set unique hostnames for each interface. The easiest approach is to use each client's fully qualified domain name as its nodename.

When working in high-availability database environments, test all worst-case scenarios (such as server crash, client crash, application crash, network partition, and so on) to ensure lock recovery is functioning correctly before you deploy your database in a production environment. Ideally, you should examine network traces and the kernel log before, during, and after the locking/disaster/locking recovery events.

The file system containing /var/lib/nfs must be persistent across client reboots. This directory is where the rpc.statd program stores information about servers that are holding locks for the local NFS client. A tmpfs file system, for instance, is not sufficient; if the client fails to shut down cleanly, the server will not be notified that it must release any POSIX locks it thinks your client is holding. That can cause a deadlock the next time you try to access a file that was locked before the client restarted.

Locking files in NFS can affect the performance of your application. The NFS client assumes that if an application locks and unlocks a file, it wishes to share that file's data among cooperating applications running on multiple clients. When an application locks a file, the NFS client purges any data it has already cached for the file, forcing any read operation after the lock to go back to the server. When an application unlocks a file, the NFS client flushes any writes that may have occurred while the file was locked.
In this way, the client greatly increases the probability that locking applications can see all previous changes to the file. However, this increased data cache coherency comes at the cost of decreased performance. In some cases, all of the processes that share a file reside on the same client; thus aggressive cache purging and flushing unnecessarily hamper the performance of the application. Solaris allows administrators to disable the extra cache purging and flushing that occur when applications lock and unlock files with the llock mount option. Note well that this is not the same as the nolock mount option in Linux. The nolock mount option disables NLM calls by the client, but the client continues to use aggressive cache purging and flushing. Essentially this is the opposite of what Solaris does when llock is in effect.

6.4) Using the Linux Automounter

For an introduction to configuring and using NFS automounters, consult Chapter 9 of O'Reilly's Managing NFS and NIS, 2nd Edition (see the appendix for URL and ISBN information).

Because Linux minor device numbers have only eight bits, a single client cannot mount more than 250 or so NFS file systems. The major number for NFS mounts is the same as for other file systems that do not associate a local disk with a mount point. These are known as anonymous file systems. Because the NFS client shares the minor number range with other anonymous file systems, the maximum number of mounted NFS file systems can be even less than 250. In later releases of Linux, more anonymous device numbers are available, thus the limit is somewhat higher.

The preferred mechanism to work around this problem is to use an automounter. This also helps with performance problems that occur when mounting very large root-level directories. There are two Linux automounters available: AMD and automounter. The autofs file system is required by both and comes built into modern Linux distributions.
More information is available on Linux automounters on the Web; see the appendix for the specific URL.
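To gauge how close a client is to the ceiling of roughly 250 NFS file systems described above, one option is to count the NFS entries in /proc/mounts. The sketch below parses text in the /proc/mounts layout (device, mount point, file system type, options, and two numeric fields); the sample entries, host names, and option strings are hypothetical.

```python
def nfs_mounts(mounts_text):
    """Return (mount point, options) for every NFS entry in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "nfs":
            found.append((fields[1], fields[3]))
    return found


# Hypothetical sample; on a real client read the text from /proc/mounts.
SAMPLE = """\
/dev/hda1 / ext3 rw 0 0
filer1:/vol/vol0/home /home nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=filer1 0 0
filer1:/vol/vol1/data /data nfs rw,v3,rsize=32768,wsize=32768,hard,tcp,lock,addr=filer1 0 0
"""

mounts = nfs_mounts(SAMPLE)
print(len(mounts), "NFS mounts")
for mount_point, options in mounts:
    print(mount_point, options)
```

Because the options column shows what is actually in effect, the same listing is also useful for confirming that rsize, wsize, and the transport match what was requested on the mount command.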

A known problem with the automounter in Linux is that it polls NFS servers on every port before actually completing each mount to be sure the mount request won't hang. This can result in significant delays before an automounted file system becomes available. If your applications hang briefly when they transition into an automounted file system, make sure your network is clean and that the automounter is not using TCP to probe the filer's portmapper. Ensure your infrastructure services, such as DNS, respond quickly and with consistent results. Also consider upgrading your filer to the latest release of Data ONTAP.

An automounter can cause a lot of network chatter, so it is best to disable the automounter on your client and set up static mounts before taking a network trace.

Automounters depend on the availability of several network infrastructure services. If any of these services is not reliable or performs poorly, it can adversely affect the performance and availability of your NFS clients. When diagnosing an NFS client problem, triple-check your automounter configuration first. It is often wise to disable the automounter before drilling into client problem diagnosis.

The Linux automounter is a single process, and handles a single mount request at a time. If one such request becomes stuck, the automounter will no longer respond, causing applications to hang while waiting to enter a file system that has yet to be mounted. If you find hanging applications on a client that is managed with the automounter, be sure to check that the automounter is alive and is responding to requests.

Some versions of Data ONTAP do not allow mount operations to occur during the small window when a fresh version of a snapmirrored replica is brought online. If the automounter attempts to mount a volume during this brief window, it will fail, but a mount request moments later will succeed.
There is no workaround for this problem by adjusting your automounter configuration, but upgrading to the latest version of Data ONTAP should resolve the issue.

Using an automounter is not recommended for production servers that may need immediate access to files after long periods of inactivity. Oracle, for example, may need immediate access to its archive files every so often, but an automounter may unmount the archive file system due to inactivity.

6.5) Net booting your Linux NFS clients

Intel systems can support network booting using DHCP, BOOTP, and TFTP. The Intel standard for supporting network booting is called a preexecution environment, or PXE. Usually this requires network interface hardware that contains a special PROM module that controls the network boot process. Network booting is especially helpful for managing clusters, blade servers, or a large number of workstations that are similarly configured.

Generally, Linux is loaded via a secondary loader such as grub, lilo, or syslinux. The secondary loader of choice for network booting is pxelinux, which comes in the syslinux distribution. See the appendix for information on how to obtain and install the syslinux distribution.

Data ONTAP releases and 6.4 support pxelinux, allowing Linux to boot over a network. Earlier versions of Data ONTAP support booting filers, but do not support pxelinux because certain TFTP options were missing in the filer's TFTP server. To enable TFTP access to your filer, see the Data ONTAP System Administrator's Guide.

You must ensure that your client hardware supports network booting. Both the client's mainboard and network interface card must have support for network booting built in. Usually you can tell whether network booting is supported by reviewing the BIOS settings on your client or by consulting the mainboard manual for information on how to set up network booting on your client hardware. The specific settings vary from manufacturer to manufacturer.

You must configure a DHCP server on the same LAN as your clients. A DHCP server provides unique network configuration information for each host on a commonly administered network. For network booting, the DHCP server also instructs each client where to find that client's boot image. You can find instructions for configuring your DHCP server to support network booting included with the pxelinux distribution. The specific instructions for setting up a DHCP server vary from vendor to vendor.

If you intend to share a common root file system among multiple Linux clients, you must create the root file system on a filer using NFSv2. This is because of problems with how filers interpret major and minor device numbers and because of differences between how NFSv2 and NFSv3 Linux clients transmit these numbers. Linux clients, unless told otherwise, attempt to mount their root file systems with NFSv2. If the file system was created using NFSv3, the major and minor numbers will appear incorrect when mounted with NFSv2. Kernels use these numbers to match the correct device driver to device special files (files that represent character and block devices). If the numbers are wrong, the Linux kernel will not be able to find its console or root file system, thus it cannot boot.

When setting up multiple clients with NFS root file systems, common practice is to maintain a separate root file system for each client and mount each with ro,nolock. Sharing a root file system among clients is not recommended. Note that /var/lib/nfs must be persistent across reboots and unique for each client. A tmpfs file system per client, for instance, is not sufficient; if the client fails to shut down cleanly, the server will not be notified that it must release any POSIX locks it thinks your client is holding.
That can cause a deadlock the next time you try to access a file locked before the client restarted. If each client mounts a private /var/lib/nfs directory via NFS, it must be mounted using the nolock mount option, and before rpc.statd and lockd have started on the client.

For more information on Linux cluster computing, search the Internet for Beowulf or OpenMosix, or see the Linux high-availability site listed in the appendix of this document.

In summary:
- Make sure to check with your Linux distributor for any errata on a regular basis to maintain a secure system.
- Be sure your filers and clients agree on what time it is.
- If your client has trouble seeing the filer, look for packet filtering on your client or switches.
- Make each client's nodename its fully qualified domain name.
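As a rough illustration of the NFS-root setup described in section 6.5, the sketch below prints a pxelinux configuration stanza for one client with its own root file system. The label, kernel image name, filer name, and export path are hypothetical placeholders supplied by the caller, not values from this report. The parameters root=/dev/nfs, nfsroot= and ip=dhcp are standard Linux kernel boot parameters, and ro,nolock follows the mount options recommended in the text.

```shell
# Print a pxelinux.cfg stanza that boots a kernel with an NFS root.
# Arguments (all example values chosen by the caller):
#   $1 = label, $2 = kernel image name on the TFTP server,
#   $3 = NFS server (filer) name or IP, $4 = exported root path for this client
pxe_nfsroot_stanza() {
    label=$1
    kernel=$2
    server=$3
    rootpath=$4
    printf 'LABEL %s\n' "$label"
    printf '  KERNEL %s\n' "$kernel"
    # One private root per client, mounted read-only and without locking,
    # as the section recommends.
    printf '  APPEND root=/dev/nfs nfsroot=%s:%s,ro,nolock ip=dhcp\n' \
        "$server" "$rootpath"
}
```

For example, `pxe_nfsroot_stanza linux vmlinuz filer1 /vol/roots/client1` prints a three-line stanza that could be placed in a per-client file under the TFTP server's pxelinux.cfg directory. The DHCP server still has to point each client at the pxelinux boot image, as described above.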

7) Executive Summary

When setting up a Linux NFS client, you should try to get the latest kernel supported by your Linux distributor or hardware vendor. Mount with NFS over TCP and NFS version 3 where possible, and use the largest rsize and wsize that still provide good performance. Start with the hard and intr mount options. Be sure the network between your clients and filer drops as few packets as possible. If you must use NFS over UDP, make sure to follow the instructions in the appendix for enlarging the transport socket buffers on all your UDP mounts.

If you have special needs, review this document carefully. Always look for the latest errata and bug fixes from your Linux distributor and watch for new Network Appliance technical reports.
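The mount recommendations in this summary can be sketched as a pair of small shell helpers that build an option string and an fstab line. This is only an illustration: the server name, export path, mount point, and the transfer size are placeholders chosen by the caller, and rw is an assumption added here rather than something the summary prescribes. The option names themselves (hard, intr, tcp, nfsvers, rsize, wsize) are standard Linux NFS mount options; kernels much newer than those covered by this report accept intr but ignore it.

```shell
# Build an NFS mount option string following the summary:
# NFS version 3 over TCP, hard and intr, with equal rsize and wsize.
# $1 = transfer size in bytes (caller's choice, to be validated by testing)
nfs_mount_opts() {
    size=$1
    printf 'rw,hard,intr,tcp,nfsvers=3,rsize=%s,wsize=%s\n' "$size" "$size"
}

# Print (but do not install) an /etc/fstab line.
# $1 = server, $2 = exported path, $3 = local mount point, $4 = transfer size
nfs_fstab_line() {
    printf '%s:%s %s nfs %s 0 0\n' "$1" "$2" "$3" "$(nfs_mount_opts "$4")"
}
```

Calling `nfs_fstab_line filer1 /vol/vol1 /mnt/vol1 32768` prints a line that can be reviewed before it is added to /etc/fstab. Printing rather than mounting keeps the sketch free of side effects.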

8) Appendix

8.1) Related Material

- Network Appliance NOW Web site: Type in Linux in the NOW PowerSearch text box.
- SysKonnect's Gigabit Ethernet performance study
- Red Hat 7.3 performance alert details
- O'Reilly's Managing NFS & NIS, Second Edition (ISBN )
- Linux tcpdump manual page: man tcpdump
- Tcpdump home page
- Linux ethereal manual page: man ethereal
- Ethereal home page
- Data ONTAP packet tracer: type pktt list at your filer's console
- Linux NFS manual page: man nfs
- Linux NFS FAQ and how-to
- Linux NFS mailing list [email protected]
- Linux network boot loader information
- Linux manual page with automounter information: man autofs
- Linux automounter information
- Linux high-availability site
- Network Time Protocol daemon home pages

8.2) Special network settings

Enlarging the transport socket buffers your client uses for NFS traffic helps reduce resource contention on the client, reduces performance variance, and improves maximum data and operation throughput. In Linux kernels after , the following procedure is not necessary, as the client will automatically choose an optimal socket buffer size.

1. Become root on your client
2. cd into /proc/sys/net/core
3. echo > rmem_max
4. echo > wmem_max
5. echo > rmem_default
6. echo > wmem_default
7. Remount your NFS file systems on the client

This is especially useful for NFS over UDP and when using Gigabit Ethernet. You should consider adding this to a system startup script that runs before the system mounts NFS file systems. The size we recommend is the largest safe socket buffer size we've tested. On clients with less than 16MB of memory, you should leave the default socket buffer size setting to conserve memory.

Most modern Linux distributions contain a file called /etc/sysctl.conf where you can add changes such as this so they will be executed after every system reboot. Add these lines to your /etc/sysctl.conf file on your client systems:

net.core.rmem_max =
net.core.wmem_max =
net.core.rmem_default =
net.core.wmem_default =

All Linux kernels later than 2.0 support large TCP windows (RFC 1323) by default. No modification is needed to enable large TCP windows. Window scaling is enabled by default.

Some customers have found the following settings to help performance in WAN and high-performance LAN network environments. Use these settings only after thorough testing in your own environment.

> # Netapp filer:
> nfs.tcp.recvwindowsize
> nfs.ifc.xmt.high 64
> nfs.ifc.xmt.low 8
> # Linux NFS client:
> net.core.rmem_default=65536
> net.core.wmem_default=65536
> net.core.rmem_max=
> net.core.wmem_max=
> net.ipv4.tcp_rmem =
> net.ipv4.tcp_wmem =
> # following is in pages, not bytes
> net.ipv4.tcp_mem =

Usually the following setting is the default for common GbE hardware:

ifconfig eth<dev> txqueuelen 1000

And

net.core.netdev_max_backlog=3000

Linux 2.6 kernels support advanced TCP algorithms that may help with WAN performance. Enabling tcp_bic or tcp_westwood can have some beneficial effects for WAN performance and overall fairness of sharing network bandwidth resources among multiple connections.

net.ipv4.tcp_bic=1
net.ipv4.tcp_westwood=1

Linux 2.4 kernels cache the slow start threshold in a single variable for all connections going to the same remote host. So, packet loss on one RPC transport socket will affect the slow start threshold on all sockets connecting to that server. The cached value remains for ten minutes. To flush the cache, you can use (as root):

sysctl -w net.ipv4.route.flush=1

This might be necessary after network conditions that caused problems have been cleared. Otherwise, cached ssthresh values will prevent good performance for ten minutes on new connections made after the network problems have been cleared up.

If you experience ARP storms, this could be the result of client or filer ARP caches that are too small. A reasonable workaround is to use routers to reduce the size of your physical networks.

These tuning parameters are documented in the kernel source tree in Documentation/networking/ip-sysctl.txt.

8.3) Controlling File Read-Ahead in Linux

Read-ahead occurs when Linux predicts that an application may soon require file data it has not already requested. Such prediction is not always accurate, so tuning read-ahead behavior can have some benefit. Certain workloads benefit from more aggressive read-ahead, while other workloads perform better with little or no read-ahead.

By default, Linux 2.4 kernels will attempt to read ahead by at least three pages, and up to 31 pages, when they detect sequential read requests from an application. Some file systems use their own private read-ahead values, but the NFS client uses the system defaults.
To control the amount of read-ahead performed by Linux, you can tune the system default read-ahead parameters using the sysctl command. To see what your system's current default read-ahead parameters are, you can try:

sysctl vm.min-readahead vm.max-readahead

The min-readahead parameter sets the least amount of read-ahead the client will attempt, and the max-readahead parameter sets the most read-ahead the client may attempt, in pages. Linux determines dynamically how many pages to read ahead based on the sequentiality of your application's read requests. Note that these settings affect all reads on all NFS file systems on your client system.
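Because both read-ahead parameters are expressed in pages, it can help to convert a byte count, such as an NFS rsize, into pages before choosing values. The helper below is a minimal sketch of that conversion. The function name is hypothetical, and the 4096-byte default page size is an assumption that holds on common x86 systems; `getconf PAGESIZE` reports the real value on a given client.

```shell
# Convert a byte count (for example an NFS rsize) into whole pages,
# rounding up.
# $1 = size in bytes
# $2 = page size in bytes (optional; 4096 is assumed if omitted)
bytes_to_pages() {
    size=$1
    page=${2:-4096}
    echo $(( (size + page - 1) / page ))
}
```

With 4096-byte pages, `bytes_to_pages 32768` prints 8, so an rsize of 32768 spans eight pages. A read-ahead value derived from rsize can then be computed from that page count instead of being worked out by hand.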

You can increase read-ahead if you know your workload is mostly sequential data or you are trying to improve WAN performance.

1. Become root
2. sysctl -w vm.max-readahead=
3. sysctl -w vm.min-readahead=15

This sets your system's read-ahead minimum and maximum to relatively high values, allowing Linux's read-ahead algorithm to read ahead as many as 255 pages. The new values take effect immediately. Usually the best setting for min-readahead is the number of pages in rsize, minus one. For example, if your client typically uses rsize=32768 when mounting NFS servers, you should set min-readahead to 7. You can add this to the /etc/sysctl.conf file on your client if it supports this; see the section above for details. The 2.6 Linux kernel does not support adjusting read-ahead behavior via a sysctl parameter.

On client systems that support a database workload, try setting the minimum and maximum read-ahead values to one or zero. This optimizes Linux's read-ahead algorithm for a random-access workload and prevents the read-ahead algorithm from polluting the client's data cache with unneeded data. As always, test your workload with these new settings before making changes to your production systems.

8.4) How to Enable Trace Messages

Sometimes it is useful to enable trace messages in the NFS or RPC client to see what it does when handling (or mishandling) an application workload. Normally you should use this only when asked by an expert for more information about a problem. You can do this by issuing the following commands:

1. Become root on your client
2. sysctl -w sunrpc.nfs_debug=1
3. sysctl -w sunrpc.rpc_debug=1

Trace messages appear in your system log, usually /var/log/messages. To disable these trace messages, echo a zero into the same files. This can generate an enormous amount of system log traffic, so it can slow down the client and cause timing-sensitive problems to disappear or change in behavior.
You should use this when you have a simple, narrow test case that reproduces the symptom you are trying to resolve. To disable debugging, simply echo a zero into the same files. To help the syslogger keep up with the log traffic, you can disable synchronous logging by editing /etc/syslog.conf and placing a hyphen in front of /var/log/messages. Restart the syslog daemon to pick up the updated configuration.

8.5) How to Enable Uncached I/O on RHEL AS 2.1

Red Hat Enterprise Linux Advanced Server 2.1 Update 3 introduces a new feature that is designed to assist database workloads by disabling data caching in the operating system. This new feature is called uncached NFS I/O and is similar to the NFS O_DIRECT feature found in Enterprise Linux 3.0 and SUSE's SLES 8 Service Pack 3. When this feature is enabled, an application's read and write system calls are translated directly into NFS read and write operations. The Linux kernel never caches the results of any read or write, so applications always get exactly what's on the server.

Uncached I/O affects an entire mount point at once, unlike NFS O_DIRECT, which affects only a single file at a time. System administrators can combine mount points that are uncached (say, for shared data files) and mount points that cache data normally (say, for program executables or home directories) on the same client. Also unlike NFS O_DIRECT, uncached I/O is compatible with normal I/O. There are no alignment restrictions, so any application can use uncached I/O without modification.

When uncached I/O is in effect, it changes the semantics of the noac mount option. Normally the noac mount option means that attribute caching is disabled. When the uncached I/O feature is in effect, noac is changed to mean that data caching is disabled. When uncached I/O is not in effect, noac mount points behave as before.

Uncached I/O is turned off by default. To enable uncached I/O, follow this procedure:

1. Become root
2. Start your favorite editor on /etc/modules.conf
3. Add this line anywhere: options nfs nfs_uncached_io=1

Uncached I/O will take effect after you reboot your client. Only mount points that use the noac mount option will be affected by this change. For more information on how to use this feature with Oracle9i RAC, see NetApp TR

Network Appliance, Inc. All rights reserved. Specifications subject to change without notice. NetApp, NetCache, and the Network Appliance logo are registered trademarks and Network Appliance, DataFabric, and The evolution of storage are trademarks of Network Appliance, Inc., in the U.S. and other countries. Oracle is a registered trademark of Oracle Corporation. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.
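The editing step in the uncached I/O procedure of section 8.5 can also be scripted. The sketch below appends the module option given in that procedure to a file unless the line is already present, so running it twice does no harm. The function name is hypothetical and the target file is passed as an argument; on RHEL AS 2.1 that would be /etc/modules.conf, but a scratch file works for trying it out.

```shell
# Append the nfs module option for uncached I/O to a modules.conf-style
# file, unless the exact line is already there.
# $1 = path to the configuration file
enable_uncached_io() {
    conf=$1
    line='options nfs nfs_uncached_io=1'
    # grep -q: quiet, -x: match the whole line, -F: fixed string
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

As the procedure states, the change only takes effect after a reboot, and only for mount points that use the noac option.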

Chapter 9 NFS file sharing
9.16 Appendix 3

Included in the printed version of this course: a document on the NFS V4 protocol written by Network Appliance. (copy on site

NFS, its applications and future
Brian Pawlowski
Vice President and Chief Architect
NFS: Its applications and future - LISA '04

Who am I? Why am I here?

The arc of the presentation
- What is NFS?
- The evolution of NFS
- NFS Version 4
- Drill down: Linux NFS Version 4
- What about iSCSI?
- Linux compute clusters
- NFS in context
- Challenges for NFS
With occasional sidetracks

What is NFS?

What is NFS?
- NFS is a protocol for a distributed filesystem.
  - Based on Sun's RPC version 2 protocol
  - Can export arbitrary local disk formats
- First revision, NFSv2, was published in
  - It exports basic POSIX 32-bit filesystems
  - Slow, particularly for writing files
- NFSv3 was published in 1994
  - Extended to 64-bit files & improved write caching
- It is perhaps the most commonly used protocol for sharing files on *NIX/Linux LANs today

Remember these file systems?
- Apollo Domain
- AT&T Remote File System (RFS)
- Andrew File System (AFS)
- Distributed File System (DFS)
(NFS Version 4 is influenced by AFS)

NFS Today
- It was 20 years ago today.
- SCSI and NFS grew up together
- Transformed from something you turn on in a UNIX release to a well-defined storage segment
- Home directories
- Large partitionable tasks that may run as parallel threads
  - Typical applications include search engines, e-mail, animation and rendering, scientific simulations, and engineering
- Scalable databases
- GRID computing

NFS Version 4

NFS Version 4
- Openly specified distributed filesystem
  - NFSv2/v3 quasi-open with Informational RFC
  - Windows, AFS, DFS not open
- Well-suited for complex WAN deployment and firewalled architectures
  - Reduced latency, public key security
- Strong security
  - Public and Private key
- Fine-grained access control
- Improved multi-platform support
- Extensible
  - Lays groundwork for migration/replication and global naming

The IETF process and NFS
[Figure: timeline of the standardization process. Labels, in slide order:]
- Sun/IETF Agreement
- Strawman Proposal from Sun
- Working Group Draft
- BOF, working group forms
- Meetings, writing, Prototyping by 5 organizations
- Additional prototyping
- Six working group drafts
- Working Group Last Call
- IETF Last Call
- IESG Review
- Assign RFC number
- Proposed Standard RFC
- Proposed Standard RFC 3530
- Two independent implementations
- 6+ months
- Draft Standard
- Internet Standard
- apotheosis

Couple things we did right this time
- Open source reference implementations of NFS Version 4 were funded early (started by Sun)
- Interoperability events held 3 times a year
  - With focus of non-connectathon events on NFS Version 4
- Huge improvements in execution and coordination over NFS Version 3

NFS Protocol Stack
[Figure: protocol stack. Labels: NFSv4 (RFC3530); KerberosV5 (RFC1510); SPKM-3; LIPKEY (RFC2847); RPC (RFC1831); XDR (RFC1832); RPCSEC_GSS (RFC2203); TCP*]

NFS Version 4 operations
ACCESS, CLOSE, COMMIT, CREATE, DELEGPURGE, DELEGRETURN, GETATTR, GETFH, LINK, LOCK, LOCKT, LOCKU, LOOKUP, LOOKUPP, NVERIFY, OPEN, OPENATTR, OPEN_CONFIRM, OPEN_DOWNGRADE, PUTFH, PUTPUBFH, PUTROOTFH, READ, READDIR, READLINK, RENAME, RESTOREFH, SAVEFH, SECINFO, SETATTR, SETCLIENTID, SETCLIENTID_CONFIRM, VERIFY, WRITE, RELEASE_LOCKOWNER

NFS operation aggregation
- The COMPOUND operation
- NFS procedures are now groupable
- Potential for reduced latencies and roundtrip times
- Part of framework for minor versioning

Example: mount server:/test/dir
Client generates this COMPOUND:
- PUTROOTFH
- GETFH
- LOOKUP (test)
- GETFH
- GETATTR: SYMLINK_SUPPORT, LINK_SUPPORT, FH_EXPIRE_TYPE, TYPE, SUPPORTED_ATTRS, FSID
- SECINFO (dir)
- LOOKUP (dir)
- GETFH
- GETATTR
The operation formerly known as MOUNT

NFS Version 4 is secure
- Mandatory to implement
- Optional to use
- Extensible via GSSAPI RPC
  - Kerberos V5 available now
  - Public key flavor emerging
- Security negotiated
  - Per file system policy
  - Continuous security
  - Security negotiation mechanisms
- Levels
  - Authentication
  - Integrity
  - Privacy
- ACLs (based on Windows NT)

Specification vs. implementation
- RFC 3530 defines required, recommended and optional features
  - The required features form core of interoperability
  - Recommended and optional features are negotiated
- ACLs, for example, are a recommended attribute
  - Not required for compliance
  - Dependent on underlying local file system support on server (on Linux - that's a lot of file systems)
  - ACLs are ill-defined in *ix environs - mapping issues are tripping us up

NFSv4 - Stateful
- Protocol is session oriented (OPEN call exists)
  - But in reality NFS Version 3 was also stateful via adjunct Locking Protocol
- Lease-based recovery of state (simplified error handling)
- File locking integrated into protocol
- OPEN provides atomicity for Windows integration
- Addition of delegations
  - Client in absence of sharing allowed to locally cache data and locking state
  - This does not mean data sharing is defined in absence of explicit locking

Delegations (making lemonade)
- Use stateful design to enhance performance and scalability
- Enables aggressive caching on client
  - Shared read
  - Write exclusive
- Reduced roundtrips of the wire
  - Read, write, locking etc. cacheable locally
  - The fastest packet is the one never sent
- Server-based policy
  - Server manages conflicts
  - Sharing model reverts to NFS Version 3 when conflicted

The Pseudo file system
[Figure: the server's local file system tree (/ A B C D G E F H I) next to the pseudo file system (/ A C D F I)]

Protocol vs. Implementation II
- Administration amongst the remaining *ixes differs
- Recommended and optional features require negotiation
- Security
- Extensions

Simplified server namespace
- Exported file systems mountable from a single server root
  - Root filehandle for the top of the file tree
  - Still a client mount command
- For all shared/exported filesystems, server constructs pseudo filesystems to span name space
- Client can still individually mount portions of the name space
- Differing security policies can cover different parts of exported space

Firewall friendly

PORTMAP     Port 111
MOUNT       Dynamic
NFSv2/v3    Port 2049
LOCK/NLM    Dynamic
STATUS      Dynamic
ACL*        Dynamic

NFSv4 replaces all of these: Port 2049 TCP

NFS Version 4 availability
- Network Appliance (Feb. '03), Hummingbird (late '02), Linux (via SuSE mid-'04, in RedHat Fedora May '04, RHEL 4.0 Dec. '04)
  - Must be explicitly enabled
- Solaris 10 imminent (uh, yesterday)
  - On by default
- IBM AIX 5L V5.3
- BSD (Darwin) (date?)

The future of NFS Version 4
- Enhanced Sessions based NFS (correctness)
- CCM - session-based security for IPsec
- Directory delegations
- Migration/replication completion
  - Core protocol defines client/server failover behaviour
  - Definition of the server-server file system movement
  - Transparent reconfiguration from client viewpoint
- Proxy NFS file services (NFS caches)
- Uniform global name space
  - Features exist in the core protocol to support this
- Support for multi-realm security

Scalability: Attacking the I/O bottleneck
- Remote Direct Memory Access
  - Bypasses CPU on client and server for networking
  - Reduces the memory bottlenecks on high speed networks such as Infiniband or 10G ethernet.
  - See the NFSv4.1 RDMA extensions
- Parallel storage extensions (pNFS)
  - Intelligent clients are given (limited!) direct access to the storage net using block protocols such as iSCSI etc.
  - Bypasses server altogether for block file operations, but not for metadata

Drill down: Linux NFS

Linux 2.6 Recent NFS (3 and 4) client changes
- Support for O_EXCL file creation
- Cached ACCESS permission checks
- Intents allow unnecessary lookups and permissions checks to be optimized away
- Asynchronous read/write improvements
  - Removed 256 page read/write request limit
  - Async support also for r/wsize < PAGE_SIZE
- DIRECT_IO / uncached I/O
- RPCSEC_GSS

Linux 2.6 NFS Version 4
- Framework added October '02 to Linux 2.5
- Additional cleanups and features added over past year
  - Performance work to get to V3 levels
  - Delegations
  - Framework for atomic OPEN
  - Memory management issues
- Basic state management
- State recovery
- Bug fixes and code stabilization

Linux 2.6 NFS Version 4
- Goal is to stabilize basic V4 and add further advanced features
- Will remain an EXPERIMENTAL feature dependent on testing
- No NFS V4 ROOT (for diskless operation)
- Must address the user experience
  - State recovery under network partition
  - NFSv4 perceived as complicated to administer
  - Strong security, name spaces, migration/replication, etc.

Linux-2.6 Generic server changes
- Adds upcall mechanism
  - Adds RPCSEC_GSS support (version 3 and 4)
  - Id mapper
  - Improves support for exports to NIS or DNS domains.
- Zero-copy NIC support
  - Accelerates NFS READ calls over UDP+TCP
- 32K r/wsize support

Linux NFS V4 - What's in
- All NFS ops and COMPOUND
- GSSAPI
  - LIPKEY/SPKM and Kerberos (authentication and integrity)
- Single server mount
- Locking
- Client side delegations (read and write)
  - Does not cache byte-range locks
- ID mapping (UID to names - user@domain)
- O_DIRECT bug fixes
- AutoFS support

Linux NFS V4 - TODO
- Security
  - RPCSEC_GSS privacy support
  - SECINFO
  - Keyrings
- Server side delegation support (going in now)
- Migration/replication client support
- Nohide mount point crossing - RFC 3530 compliance
- ACL support
- Named attribute support
- Non-Posix server support (missing attribute simulation)
- Volatile file handle handling
- Global name space

What about iSCSI?

iSCSI background
[Figure: protocol stack diagram. Labels: SCSI Protocol; iSCSI; TCP; IP; FCP; FibreChannel; Parallel Bus]
- TCP/IP Transport for SCSI command sets
  - SCSI block protocol access
- Internet standard - RFC 3720
- iSCSI is a direct replacement for FCP
  - FCP is the SAN fabric today

The important thing about iSCSI is SCSI
[Figure: packet layout. Fields: Ethernet Header; IP; TCP; iSCSI; SCSI; Data; CRC Checksum. Annotations: "The important stuff and common with FCP"; "Addressing and routing and security"]

So, iSCSI is a replacement for NFS, right?
- In the first iSCSI presentation I made to a prospect, this was the first thing out of the IT manager's mouth
- I used to say No, but the thoughts underlying the question are interesting

iSCSI value proposition
- Leverage existing Gigabit 10 Gigabit networking infrastructure
- Leverage existing rich set of management tools
- Leverage existing base of skilled personnel
- Reducing costs

iSCSI points
- iSCSI software drivers freely and ubiquitously available
  - Windows platforms
  - Linux, and other *ixes
- HBAs and TOEs
  - Scale performance from software solution HW assist full offload
- Saying performance and iSCSI in the same breath though misses the point
  - Performance is not always the primary issue (else use FC SAN)
  - Many application deployments have spare (CPU and I/O) capacity
  - Optimize performance as needed

For some, iSCSI represents the path of least resistance
- It is semantically equivalent to FC SAN (SCSI)
  - But more familiar because of TCP/IP and Ethernet - so friendly outside the data center
- Application migration is trivial
  - My remote booting desktop from FC to iSCSI
- Provides a path for easily reclaiming FC port capacity by moving less critical apps to iSCSI
- With some of the important cost benefits of NAS

iSCSI is part of a solution
- Cost effective alternative to captured storage
  - Windows HCL afterburner
- A place to move less critical SAN applications freeing up FC capacity
  - The early adopter approach
- Friendly outside the data center
  - And manageable!
  - Easily envision applications like remote C: drive
  - Remind me to tell you a story
- An additional tool in the storage toolbox

Cluster computing and Linux

The Old Way
Imagine Charlton Heston in a chariot.

The New Way
Imagine an airplane full of chickens.

The GRID - what is it?
- Sets of toolkits or technologies to construct pools of shared computing resources
  - Screen-savers that use millions of desktop machines to analyse radio telescope data.
  - Vertical applications in the enterprise on a large Linux cluster.
  - Middleware that ties together geographically separated computing centres to satisfy the exponentially increasing demands of scientific research.
- To many, Grid computing and Cluster computing are synonymous.

Modern numerology (this is not important)
[Table: the number column is mostly lost; the surviving values are: <9 1 10?]
- The preferred architecture for commodity computing (and oddly, the number of years it took the Red Sox to win the World Series)
- Number of physical processors in commodity pizza boxes (poor man's blade)
- Maximum expected nodes in a Linux database cluster
- Typical number of Linux nodes in a render or ECAD simulation farm today
- Expected number of nodes in Linux compute cluster in next two years?
- Number of filers per 1000 Linux nodes in GRID
- There is only one - Linus Torvalds
- The number of trusted minions to Linus

Scalable compute cluster
- Linux is ahead of the game
  - growing infrastructure, expertise and support
- It's all about choice!
- No! It's all about freedom!
- Well, no actually, it's all about cost.

The rise of the Linux compute cluster
- Driven by cost
  - Particularly the cost of cheap commodity CPUs
  - Monolithic application servers have no chance competing with the Tflop/price ratio
- Choices exist
  - *BSD development marches on
  - Sun is renewing its investment in Solaris x86
  - Even Windows has a market share

Compute cluster points
- The x86 platform won
  - Any questions?
- Support costs may still be significant...
  - ...but largely offset by the hardware cost savings - it's about leveraging small MP commodity x86 hardware
  - Some customers choose to pay more for better quality in order to lower support costs and improve performance
- Maturation of free software - paying for support
- For Unix environments, NFS is the cluster file sharing protocol of choice
- Customers simply want storage solutions that scale as easily as their compute clusters

Applications: Batching
- Large partitionable tasks that may run as parallel threads
- Clusters may include up to several thousand nodes
- Often uses LSF and other queueing tools to manage jobs
- Read-only data may be shared, but writing is partitioned to avoid locking issues.
- Typical applications include search engines, e-mail, animation and rendering, scientific simulations, and engineering

Other applications
- Scalable databases
  - Fewer nodes than the batch case: up to a hundred or so.
  - High degree of write sharing necessitates heavier use of locking
  - Data integrity is often supported by specialized fencing and recovery techniques that may again need support in the underlying filesystem.
  - Extremely I/O intensive
- High performance computing
- GRID computing

Cluster-friendly features in NFSv4
- Stateful model...
  - ...but recovery remains client driven! In case of a recoverable server outage, clients just retry operations until they succeed
- Leases solve NFS Version 3 state leakage problems due to client outages
  - Cures the lost locks syndrome
  - But introduces new issues on network partition
- Delegations allow for aggressive caching
  - Stops short of a full coherency model, though
  - Callback mechanism is firewall-unfriendly.

Cluster-friendly features in NFSv4
- GRID friendly
- Obligatory support for strong security makes NFS deployment over the internet possible
  - Including public key
  - Includes support for data encryption
- Firewall-friendly: only port 2049 needs to be opened
  - there are no side-band mount/lock/... protocols
  - callbacks/delegations are not mandatory

NFS in the short term future
- Aim to provide robust global name spaces
- Adaptations to work with GRIDs
  - GRIDNFS project to adapt NFS to the Globus GRID toolkit
  - In the long run, we need improved caching models for use with high latency environments
- Performance improvements using hardware assisted networking
  - NFSv4.1 includes support for RDMA

Storage in a clustered environment
Scale my storage as easily as I can scale my CPUs.
- Data sharing
  - Data must be accessible to all compute nodes
- (high) data availability
  - No single point of failure
- Reliable data handling
  - No data corruption
- Security
  - In particular secure transport of data between compute and storage nodes
- Support commodity TCP/IP and ethernet networking
- Performance

Scaling yet further
- Virtualization techniques permit horizontal scaling of storage using the current protocol
  - See NetApp/Spinnaker
- Parallel NFS
  - NFSv4 extensions to scale beyond storage network bandwidth limits
  - Allow for striping files across several storage units
  - Explore SAN and object storage integration (NFSv4 as metadata server)

281 Putting NFS in perspective

Let's put this in perspective
"Wow, Michelangelo, great statue - was that a 7 inch chisel you used?"
"Great flick, Welles, what camera did you use?"
"Great quarter you guys had! Did you use NFS to access your financial data?"

282 It's about applications
Applications drive storage choices
  What does the application vendor support?
  What do they recommend?
  For example, Exchange is driving iSCSI in the Windows environment
Mix of applications in a single enterprise
  There is no one perfect storage approach
  There's likely more than one vendor

It's about data management
Integration of applications with data management
  Key applications like Exchange - application-driven backup/restore
  Fertile ground for virtualization - blurring line between client application and storage
Disaster recovery
Finding data when you need it
  Higher level data organization and grouping?

283 Non-Disruptive Migration
[Figure: Global Name Space over volumes Vol A, Vol B and Vol C, with NearStore]
Migration:
  Fast (on-demand)
  Transparent to users
  Migration of Name Space or backend Volumes
  Migrate aged data to NearStore

It's about cost
Ability to (re)provision, expand and manage storage to maintain high utilization will most affect overall cost long term
Leveraging commodity networking
  iSCSI and NFS are similar here
Primary storage and Nearline support for all storage access - transparently
  Migration and replication
Consolidation to reduce management costs

284 Existing Storage Hierarchy
[Figure: two tiers reached over LAN or WAN. Primary Storage (servers, storage networks, heterogeneous storage, filers): $$$$$/MB, very fast. Archive Targets (optical library, tape library): $/MB, slow. A price/performance gap lies between them.]
The traditional two-tier hierarchy creates a large price/performance gap
Current challenges need a storage solution that fills the gap

Economics of recovery
Acceptable downtime is application dependent
Simplest - but most costly - approach is full mirror online
ATA drives are cheap

285 Emerging Storage Hierarchy
[Figure: three tiers reached over LAN or WAN. Primary Storage (servers, storage networks, heterogeneous storage, filers): $$$$$/MB, very fast. Backup Target/Reference Data (NearStore(TM) appliance): $$/MB, fast. Archive Targets (optical library, tape library): $/MB, slow.]

Understanding the context around NFS
Regardless of data access protocol, similar issues and solutions in data management
  Common management regardless of access method
That other operating systems (besides Linux) drive fundamental architecture decisions
  Blade provisioning via NFS is a non-starter perhaps - because of multi-OS support
  Enter iSCSI - the least common denominator - the Windows splash effect
People don't buy NFS servers
  They buy Oracle or other applications
  They build application compute clusters
  And manage the data around it - with NFS perhaps

286 beepy, are you saying there is no difference between storage architectures?

Differences are important
NAS protocols define a file view - higher level organization and semantics
  Enables sharing
  Enables large compute clusters (>5,000 nodes)
iSCSI, like FC SANs, provides simpler SCSI block interface
  Higher level semantics via explicit file system encapsulation
  Sharing via layered cluster file system (complexity and cost?)
Customers will use and continue to explore a variety of approaches

287 The challenge for NFS

Other than that, Mrs. Lincoln...
NFS = Network File System
NFS = Not For Speed
NFS = Not For Security
But an NFS vendor may sit there and think "My side of the boat is dry!"
Exactly. First impressions last a long time, and are hard to turn around once set.

288 So, back to NFS - what do customers want?
No surprises
  Customers really want a better NFS Version 3
  Are we prepared to provide support for NFS Version 4?
Reliability
  Testing
Scalability
Playing well with others
  Agreeing on common administration models
  Agreeing on common features (else we will drop things from spec in IETF)
Security
Administration needs to be simplified, simplified, simplified
Performance is at bottom of list I think

Additional Resources
NFS (Linux NFS) (NFS Version 3) (NFS and Linux clusters)
NFS Version (OLD)
SAN and iSCSI case studies
SAN Performance Technical Report

289 Questions

290 Chapter 9 NFS file sharing, 9.17 Appendix 4
Attached to the printed version of this course is a document on the KICKSTART procedure.

291 Fedora Kickstart
Red Hat Magazine 2004 No. 3

Use Fedora Kickstart over the network to carry out a fresh installation in little time.

Linux is increasingly used not only in very large networks, but also in situations where many computers have to be installed or where a reinstallation has to be done in a short time. Examples are computer clusters, or the manual reinstallation of workstations on a company network after a hard disk failure. These operations cost staff far too much time and energy. Kickstart, available in the Fedora installer, is the ideal solution to these problems, since it performs a complete installation automatically, including the execution of scripts tailored to a specific configuration or need.

Warning: when an installation that normally runs automatically is used to reinstall a server or workstation whose hard disk holds very important data, use Kickstart with caution, because a single error in the configuration file could destroy that data. Be very careful when creating the Kickstart configuration file.

Why Kickstart?
Compared with other installation methods, Kickstart offers many advantages. Standard installations for groups of computers can be configured centrally, because the hardware components recognized by the Red Hat installation program can be left out of the configuration file. As its installation template, the Kickstart method needs only a single description file, which uses a specific syntax to tell the installer which operations to perform.

This description file brings the second major advantage of the Kickstart method: no installation image specific to each PC is needed, unlike almost all other installation or recovery programs. Kickstart uses only the Fedora installation media. In addition, the description file is largely independent of the version. The same file can therefore be used, within some limits (the package selection, for example), for current and future Fedora installations. In practice, Kickstart installs a system with the same characteristics every time, but uses the corresponding distribution and its installation program as the base.

Types of installation
Kickstart offers two methods for storing the installation description files.
The diskette-based method requires a suitable boot diskette for each computer installed. This method uses the installation boot diskette (created from the boot.img image files of the Linux distribution in use) together with the description file.
The network-based method loads the Kickstart file from a suitable server. The boot medium can be a diskette or, for example, PXE or another environment built for booting from particular network cards.
The first method is recommended for small installations, while the network-based method is preferable for larger ones. In that case, little time is needed to customize the server for Kickstart.

To start an automatic installation, the Red Hat installation media must in any case be copied to the server, whichever type of installation is wanted: diskette-based or network-based. This article does not cover CD-ROM-based installation, since it does not allow an automatic installation without at least minimal user intervention.

292 Table 1: Base installation media.
  A directory on disk (ext2)
  An NFS share
  An HTTP share (web server)
  An FTP share
  The Red Hat installation CD-ROMs (which do not allow automatic installation)

Copying the installation media to the server
To copy the installation media to the server, the target partition must first have enough free space: about 2 GB. Then create a new directory to hold the installation media. The following example uses the directory /kickstart/fc1a. Insert the first Fedora CD in the CD-ROM drive and type:

    # mount /mnt/cdrom
    # cp -af /mnt/cdrom/redhat /kickstart/fc1a
    # cp /mnt/cdrom/release-notes* /kickstart/fc1a    (run this command for the first CD-ROM only)
    # umount /mnt/cdrom

Repeat the operation for the second and third installation CDs. The installation packages and all Fedora binaries are then copied into the directory /kickstart/fc1a.

Server shares
Export the installation base so that clients can find it on the server. Many export methods are available, but NFS and HTTP shares are the most commonly used. The following example uses an NFS share because, for a network-based installation, Kickstart requires the corresponding configuration file on an NFS share. In principle, installations can also be run from other base media. For example, a computer that publishes shared data on the Web could serve as the base for large installations. To create the NFS share, check that the required RPM packages are available on the server by typing the rpm command below:

    # rpm -q portmap nfs-utils

The command must return the names of both packages. Check that the complete /kickstart directory is exported (see Listing 1).

    /kickstart *(ro,all_squash)

Listing 1: /etc/exports

The Kickstart configuration files can later be stored in this directory. Start the NFS service by typing:

    # /etc/init.d/nfs start

If an error message like this one appears:

    Starting NFS quotas: Cannot register service: RPC: Unable to receive...

the port mapper must also be started with the command:

    # /etc/init.d/portmap start

DHCP configuration
To install from a server that holds the configuration files, a DHCP server must be available on the network of the computer to be installed. Box 2 shows a somewhat more complex configuration file for the DHCP daemon, based on two subnets logically attached to the computer. Every subnet that a computer's DHCP daemon is responsible for must always be directly connected to that computer, for example through a network card. Also make sure that only one DHCP server is used per subnet, because several servers could hand out configuration data arbitrarily. To install and start the DHCP daemon if it is not yet installed, type the following commands:

    # up2date dhcp
    # vim /etc/dhcpd.conf    (write the configuration file)
    # /etc/init.d/dhcp start

If the DHCP server is running, the Kickstart installation receives its IP address from the DHCP server and then selects the Kickstart configuration file to use. The next-server option specifies the NFS server that holds the file, while the filename option gives the installation program the name of the NFS share. Kickstart then looks in this directory for the configuration file, using the following naming scheme:

    <IP address>-kickstart

Symbolic links and groups can of course also be created in this directory. For example, one Kickstart file can be created for a group of workstations and used only for the IP addresses concerned. The examples later in this article show how such a Kickstart file is created.
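The naming scheme just described (one file per client, named after the client's IP address, optionally a symbolic link to a shared group file) can be sketched as follows. This is an illustration only: the helper names `kickstart_file` and `group_links`, the file name `workstations.cfg` and the 192.0.2.x addresses are hypothetical and do not come from the article.

```python
import ipaddress
import posixpath


def kickstart_file(directory: str, client_ip: str) -> str:
    """Return '<directory>/<client IP>-kickstart', the scheme described in the text."""
    ip = ipaddress.IPv4Address(client_ip)   # raises ValueError on a malformed address
    return posixpath.join(directory, f"{ip}-kickstart")


def group_links(directory: str, shared_file: str, client_ips: list[str]) -> dict[str, str]:
    """Map each per-client file name to the shared group file it should link to."""
    return {kickstart_file(directory, ip): shared_file for ip in client_ips}


# Hypothetical group of two workstations sharing one Kickstart file.
links = group_links("/kickstart/", "workstations.cfg", ["192.0.2.11", "192.0.2.12"])
```

Each key in `links` is the name a symbolic link would need so that the corresponding client picks up the shared file.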

293
    # Global definitions:
    option domain-name "reseau-essai.redhat.de";
    default-lease-time 3600;
    max-lease-time 7200;
    ddns-update-style none;

    # Definition of the first subnet (all parameters specified
    # inside the subnet are valid for that network only).
    subnet  netmask  {
        option routers ;                # Default gateway
        option domain-name-servers ;    # DNS server
        option subnet-mask ;            # Subnet mask
        next-server ;                   # Address of the NFS server holding the
                                        # Kickstart configuration file
        filename "/kickstart/";         # Directory containing the Kickstart file
                                        # (scheme: $[client IP address])
        range ;                         # Addresses assigned by the server
                                        # (in this case from  to  )
        # For an efficient and differentiated installation, the
        # clients to be installed must always get the same IP
        # address, because the Kickstart file is selected
        # according to the IP address.
        group {
            use-host-decl-names on;
            host workstation1 {                        # Computer name
                hardware ethernet 00:cb:0b:18:10:45;   # MAC address of the network card
                fixed-address ;                        # IP address to assign to the computer
            }
            # Other computers can be specified here.
        } # End of the computers with a specified IP address
    } # End of the subnet

    # Other subnets can be specified following the same scheme
    subnet  netmask  {
        # Definitions
    }

Box 2: /etc/dhcpd.conf

Creating a configuration file with redhat-config-kickstart
The current version of Fedora includes the redhat-config-kickstart tool. If the tool is not yet installed on the system, install it with up2date. To start it, type redhat-config-kickstart. A start window appears (Figure 1) that offers the same options as a Fedora installation. The tool can either modify an existing configuration file, provided that it was created with the Red Hat tool, or create a new one. Explore the menu items shown on the left. Once again, be very careful if important data is stored on the computer. The Basic Configuration menu item contains very important options such as the language settings or the root password.

Figure 1: Basic configuration.

The Installation Method option selects among the various installation bases and also chooses between a fresh installation and an upgrade of an existing Fedora installation. The boot loader options cover the basics of GRUB/LILO configuration. A detailed configuration, such as a dual-boot system, is not possible at this point. The Partition Information section specifies the partitioning. Two Kickstart-specific options are available for this purpose: ondisk, which creates a given partition on a specific hard disk, and onpart, which uses an existing partition instead.
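Because the Kickstart file is chosen by IP address, each client needs a fixed-address host declaration like the `workstation1` entry in /etc/dhcpd.conf. A minimal sketch for generating such declarations is shown here. The function name `host_stanza` and the address 192.0.2.21 are hypothetical; only the `host`, `hardware ethernet` and `fixed-address` statements are taken from the listing.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$", re.IGNORECASE)


def host_stanza(name: str, mac: str, ip: str) -> str:
    """Render one dhcpd.conf host declaration with a fixed address."""
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    address = ipaddress.IPv4Address(ip)   # validates the dotted-quad form
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac.lower()};\n"
        f"    fixed-address {address};\n"
        f"}}\n"
    )


# Hypothetical client, reusing the MAC address shown in the listing.
stanza = host_stanza("workstation1", "00:cb:0b:18:10:45", "192.0.2.21")
```

The generated text is meant to be pasted inside the `group { ... }` block of the relevant subnet.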

294 Since Kickstart uses the Red Hat installation program, and therefore its hardware detection, the hardware parameters do not need to be specified in the menu items described below. The appropriate drivers are configured automatically during installation. In general, the authentication data only needs to be changed when a network authentication system such as NIS or LDAP is used. If the system belongs to a protected local network, the Firewall Configuration option can be disabled. For the X Configuration, check that the appropriate configuration is selected before enabling automatic start of the X Window System. Although almost all hardware problems can be solved nowadays, both the author and the distributor should be very careful before trying to satisfy special hardware requirements. Configuring XFree86 after the installation, with the redhat-config-xfree86 tool, is the best compromise. The graphical tool can select packages, but only as package groups, as shown in Figure 2. For a finer setting, for example selecting a single package, the configuration file has to be edited by hand. This topic is covered in more detail in another section of this article.

Figure 2: Package selection.

The same applies to the %pre and %post scripts. Scripts make user-defined modifications possible. For example, they can act over the network, such as sending a message with the text "Administrator, the computer has been installed", and much more, with no limits on creativity. After setting all the desired options, save the Kickstart configuration by clicking File -> Save File. The file obtained for the network used here as an example is similar to the one shown in Box 3. A file of this kind is presented in more detail in the next section.

    # Generated by the Kickstart configurator
    lang en_us                                  # System language
    langsupport de_de --default=en_us           # Languages to install
    keyboard de-latin1-nodeadkeys               # Keyboard type
    mouse generic3ps/2                          # Mouse type
    timezone Europe/Berlin                      # Time zone
    rootpw --iscrypted $1$To1ZVQMJ$SjOFyd7tZ2y.fr.g2omt//    # Root password
    reboot                                      # Reboot after installation
    text                                        # Text-mode installation
    install                                     # Fresh installation, not an upgrade
    nfs --server= --dir=/kickstart/FC1A         # Installation medium
    bootloader --location=mbr                   # Where to install the boot loader
    zerombr yes                                 # Clear the MBR
    clearpart --all --initlabel                 # Remove all existing partitions
                                                # and the partition table
    # Partitioning information
    part /boot --fstype ext3 --size  --asprimary
    part / --fstype ext3 --size 1 --grow
    part swap --size 512
    # Authentication options
    auth --useshadow --enablemd5
    # Network interface configuration
    network --bootproto=static --ip= --netmask= --gateway= --nameserver= --device=eth0
    firewall --disabled                         # Disable the firewall rules
    # XFree86 configuration
    xconfig --depth=32 --resolution=1280x  --defaultdesktop=kde
    # Package selection (dependencies are resolved automatically)
    %packages
    X Window KDE Desktop Graphical Administration Printing Support

Box 3: ks.cfg

295 Creating a configuration file manually
Why create a configuration file by hand, a rather complicated operation, when almost everything is available in the Fedora tool? There are several answers: the installation program may not recognize a hardware component such as a network card, or the user may want to customize the system with scripts. Even the selection of a single package from the distribution has to be done by hand. It is advisable to start from a file created with the GUI tool, to avoid structural errors or misplaced elements that could interrupt the installation process or trigger prompts for information that is missing or wrong in the Kickstart description. In any case, after a few attempts the error is almost always found, which shows how effective this installation method is. Box 4 shows an example of what Kickstart can do, using an old network card that is not detected and a post-installation script.
    lang de_de
    langsupport de_de
    keyboard de-latin1-nodeadkeys
    mouse generic3ps/2
    timezone Europe/Berlin
    rootpw redhattestpasswort
    text
    install
    nfs --server= --dir=/kickstart/fc1a
    device ethernet 3c509 --opts io=0x320, irq=7    # Ethernet card
    bootloader --uselilo --linear --location=mbr
    zerombr yes                                     # Clear the MBR
    clearpart --all --initlabel
    part /boot --fstype ext3 --size 80 --asprimary
    part / --fstype ext3 --size 1 --grow \
        --onpart hda2 --maxsize 2000
    part /var/www --fstype ext2 --size
    part swap --size 512
    auth --useshadow --enablemd5
    network --bootproto=dhcp
    firewall --disabled
    skipx                                           # Skip the X configuration
    # Package selection (dependencies are resolved automatically)
    %packages
    Administration Printing Support
    lynx
    httpd
    %post
    # (Scripts run chrooted inside the finished installation)
    # Name server setup:
    cat > /etc/resolv.conf <<EON
    search domain1.de domain2.de unterdomain.domain2.de
    nameserver
    nameserver
    EON
    # Enable DMA access to the hard disks
    cat > /etc/sysconfig/harddisks <<EOF
    USE_DMA=1
    MULTIPLE_IO=16
    EIDE_32BIT=3
    LOOKAHEAD=1
    EXTRA_PARAMS=-X68
    EOF
    chkconfig lpd off
    chkconfig httpd on
    mail -s "Computer installation finished" \
        root@mailserver < /dev/null

Box 4: ks-avancé.cfg
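A hand-written Kickstart file like the one above has a simple layout: installer commands first, then sections introduced by %packages, %pre or %post. The following sketch splits a file along those markers so each part can be reviewed or compared separately. The function name `split_sections` is hypothetical, and the sketch assumes the old section style shown here, without closing markers. It is not the parser used by the installer.

```python
def split_sections(ks_text: str) -> dict[str, list[str]]:
    """Split Kickstart text into 'commands', '%packages', '%pre' and '%post' lines."""
    sections: dict[str, list[str]] = {"commands": []}
    current = "commands"
    for raw in ks_text.splitlines():
        line = raw.strip()               # note: indentation inside scripts is dropped
        if not line:
            continue
        first = line.split()[0]
        if first in ("%packages", "%pre", "%post"):
            current = first              # a section header switches the target list
            sections.setdefault(current, [])
            continue
        if current in ("commands", "%packages") and line.startswith("#"):
            continue                     # comments are kept only inside scripts
        sections[current].append(line)
    return sections


sample = """text
# comment
network --bootproto=dhcp
%packages
lynx
httpd
%post
# keep me
chkconfig httpd on
"""
parts = split_sections(sample)
```

Comparing the `commands` list of a failing file with that of a known-good one is a quick way to apply the advice of starting from a file produced by the GUI tool.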

296 Information on the package groups used in the description file (Internet, for example) is available in the file:

    /kickstart/fc1a/RedHat/base/comps

Note that the Red Hat installer only installs packages that are digitally signed by Red Hat. All third-party packages must be installed in the %post section with the rpm command. This ensures that the base installation is always stable.

Preparing a boot diskette
To create a boot diskette, insert the first Fedora CD in the CD-ROM drive and a formatted 1.44 MB diskette in the computer's first diskette drive. Then type the following commands:

    # mount /mnt/cdrom
    # dd if=/mnt/cdrom/images/bootdisk.img \
        of=/dev/fd0
    # umount /mnt/cdrom

A generic installation image has now been written to the diskette. The diskette can be used for standard network-based Linux installations as well as for automatic Kickstart installations. To start the Kickstart mode (linux ks) automatically, mount the diskette by typing:

    # mount /mnt/floppy

Then follow whichever of the two methods below applies:

1. Kickstart file on diskette (diskette-based)
Using a text editor, modify the following two lines in the file /mnt/floppy/syslinux.cfg:

    prompt 0
    default linux ks=floppy

2. Kickstart file on the network (network-based)
Modify the following two lines:

    prompt 0
    default linux ks

Once this is done, the Kickstart configuration file can be copied onto the diskette as /mnt/floppy/ks.cfg. Unmount the diskette again by typing:

    # umount /mnt/floppy

Preparing a boot CD-ROM
If the network drivers needed for the installation do not fit on a diskette, we recommend using a boot CD-ROM. Burn the ISO image file boot.iso, found in the same directory as the boot diskette image, onto a blank CD and then enable booting from CD-ROM in the BIOS of the computer to be installed. Typing linux ks is enough to start the Kickstart mode. It is advisable not to modify syslinux.cfg for an automatic start, because CD-ROMs are often forgotten in the drive, which would trigger a new installation at every boot. To address this problem, especially for large installations, it is useful to send a message at the end of the installation process that tells the administrator which computers have already been installed.

Starting the installation program through PXE
PXE is without doubt the most elegant way to run large network installations. It is particularly suitable as a protocol for the many network cards fitted in company computers. PXE can load the Fedora installation program directly from the network, so no boot medium is needed. Boot options can also be included that let the user restore the system automatically. Since the PXE environment cannot be described in detail in this article, we refer you to the site Kickstart.html, which contains an interesting article by Alf Wachsmann with detailed instructions on using Kickstart.

Installation procedure
After the medium has been inserted and the computer started, Fedora begins by launching the Anaconda installer. In Kickstart mode with a network-based configuration, Kickstart first tries to obtain its own IP address through a DHCP request and then to load the corresponding description file from the server. If this procedure succeeds, the basic configuration can be considered complete. During the installation, the normal installation screen can be followed on console 1 (ALT+F1). Other installer functions become available from a certain phase of the installation process. Console 2 (ALT+F2) provides a Bash console for gathering debugging information. Console 3 helps detect possible hardware problems, because the kernel module messages are often visible there. That is why, in case of problems

297 during the installation, it is useful to check them on this console. In addition, always check the file /var/log/messages on the relevant server. Running the command:

    # tail -f /var/log/messages

shows, live, which steps of the installation process completed correctly and which ones have problems. For example, if no DHCP or NFS requests arrive from the client computer, a hardware problem on the network is the likely cause. The tcpdump sniffer package can also help diagnose hardware problems, for example when it records no traffic from the client.

[Figure: Kickstart installation sequence. Labels: boot from CD-ROM/diskette; DHCP: determine the network configuration; NFS: mount the /kickstart directory; interpret the Kickstart configuration; partition and format the hard disk; %pre script; install and configure the packages; %post script and reboot.]

Sources of error
Faulty NFS shares are a well-known source of error in automatic installations, since they prevent the host from accessing the Kickstart configuration files and the distribution files. The command:

    # showmount -e <IP address of the server>

always displays all NFS shares of the server concerned. The problem is almost always due to typing errors in these shares. Another very common error in Kickstart installations is incorrect syntax in the description file, which interrupts the installation process or, in many cases, causes an error in a Python script. These errors are almost always in the basic system description rather than in the %pre or %post scripts. The best solution is therefore to compare the file with ones that have already been checked and are known to be correct. If some files have been created by hand with the GUI, the familiarity gained makes it possible to solve problems even faster.

Installing individual packages
On groups of computers installed with Kickstart, new software versions may have to be installed later. For this purpose, use RPM packages, which can be installed in the %post script. RPM packages allow complex customizations that can later be removed with standard commands, such as adding a few users or creating a profile for particular activities. Box 5 shows an example spec file that adds a user who can be removed at any time. The purpose of the example is remote support by a hotline. To do this, a log script is referenced in a tar archive of the spec file. Installing the RPM packages from the Kickstart %post script also extends this access to all the other computers.

    Summary: RemoteSupportUser
    Name: remotesupport
    Version: 0.1
    Release: 1
    License: GPL
    Group: System Environment/Base
    Source0: %{name}-%{version}.tar.gz
    Buildroot: %{_tmppath}/%{name}-%{version}-buildroot
    Packager: Frederik Bijlsma
    Provides: remotesupportuser
    Requires: /bin/mail
    BuildArch: noarch

    %description
    NULL

    %prep
    %setup -q

    %build
    #empty

    %install
    rm -rf %{buildroot}
    mkdir -p %{buildroot}/home/remotesupportuser
    install -m 0755 .logrc %{buildroot}/home/remotesupportuser
    install -m 0755 .bash_profile %{buildroot}/home/remotesupportuser

    %clean
    rm -rf %{buildroot}

    %pre
    userdel remotesupportuser >/dev/null 2>&1
    userdel `awk -F: '$3 == "9993"' /etc/passwd | awk -F: '{ print $1 }'` >/dev/null 2>&1
    useradd -c "Remote Support User" -d /home/remotesupportuser -g root -s /bin/bash -u 499 remotesupportuser >/dev/null 2>&1

    %postun
    userdel remotesupportuser > /dev/null 2> /dev/null

    %files
    %defattr(-,root,root)
    %attr(0775,remotesupportuser,root) /home/remotesupportuser/scriptlog
    %attr(0755,remotesupportuser,root) /home/remotesupportuser/.bash_profile
    %attr(0755,remotesupportuser,root) /home/remotesupportuser/.logrc

    %changelog

Box 5: firmeuser.spec

298 Conclusion
This article has described a network-based Kickstart installation that relies on a working DHCP server and on an NFS server holding the Kickstart description files and the installation media. Since many companies already run both kinds of server on their networks, using Kickstart should not prove too costly, apart from the few changes that are needed. Kickstart is recommended when many computers have to be installed at the same time, for clusters for example. With a very fast network, many machines can be installed within an hour, automatically or individually. With a little practice, Kickstart can be used for many purposes, for example to install large clusters or to restore the secretary's workstation, without paying technicians who usually charge extra for such operations. In case of problems with Kickstart, consult the mailing list. Technical support is of course also available, but with Kickstart you will rarely need it.

Frederik Bijlsma is a consultant for Red Hat, Stuttgart

Chapter 10 File synchronization

Why this chapter
NFS file sharing works on a local network. When one is not connected through a local network, or when the connection is not permanent (a roaming laptop, for example), NFS file sharing is no longer possible. File synchronization is used instead.

10.1 Introduction
Context: two or more machines that cannot share files through NFS.
Examples:
- a laptop and a desktop machine
- two computers connected by an intermittent link such as a PPP telephone link
- etc.
Manual synchronization is tedious. Several programs automate it.

10.2 UNIX file synchronization with rdist
RDIST is a program that distributes files to remote machines, based on the modification dates of the source files or of the remote files.
Obsolete program!
It is increasingly shipped as standard with the various UNIX systems, but the version available at the URL [...] (formerly «ftp://ftp.usc.edu/pub/rdist/») is still preferable.

10.3 UNIX file synchronization with rsync
(short for Remote Synchronisation)
RSYNC is a more efficient protocol than RDIST. For example, only the differences between files are transferred, not the whole file.
If it is not installed as standard, see [...]
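To illustrate the idea that only the differences between files are transferred, here is a minimal Python sketch. It is deliberately simplified and is not the actual rsync algorithm (rsync uses a rolling checksum so that blocks can match at any byte offset, whereas this sketch compares aligned blocks only). The names `block_signatures`, `delta` and `apply_delta` and the tiny block size are assumptions chosen for the example.

```python
import hashlib

BLOCK = 4  # tiny block size, only to keep the example readable


def block_signatures(data: bytes) -> dict:
    """Map the hash of each fixed-size block of the old file to its index."""
    sigs = {}
    for i in range(0, len(data), BLOCK):
        digest = hashlib.sha256(data[i:i + BLOCK]).hexdigest()
        sigs.setdefault(digest, i // BLOCK)
    return sigs


def delta(old: bytes, new: bytes) -> list:
    """Describe `new` as references to blocks of `old` plus literal bytes.

    Simplification (assumption): blocks are compared on aligned boundaries.
    """
    sigs = block_signatures(old)
    ops = []
    for i in range(0, len(new), BLOCK):
        chunk = new[i:i + BLOCK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in sigs:
            ops.append(("copy", sigs[digest]))   # the receiver already has it
        else:
            ops.append(("literal", chunk))       # must be sent over the wire
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new file on the receiving side from `old` and the delta."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * BLOCK:(value + 1) * BLOCK]
        else:
            out += value
    return bytes(out)


old = b"AAAABBBBCCCCDDDD"
new = b"AAAAXXXXCCCCDDDD"
ops = delta(old, new)
literal_bytes = sum(len(v) for kind, v in ops if kind == "literal")
print(ops)
print(literal_bytes, "literal bytes sent out of", len(new))
```

In this toy case only one block out of four has to be sent as literal data; the other three are rebuilt from the copy the receiver already holds.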

The most commonly used synchronization modes:
- from a remote machine to the local machine:
    rsync machine-distante.example.com:source destination
- from the local machine to a remote machine:
    rsync source machine-distante.example.com:destination
- from the local machine to the same local machine:
    rsync source destination

Useful options:
- option «-a»: archive mode, equivalent to the options «-r», «-l», «-p», «-t», «-g», «-o» and «-D» below
- option «-r»: recursive transfer
- option «-l»: preserve symbolic links
- option «-p»: preserve permissions
- option «-t»: preserve modification dates
- option «-g»: preserve GIDs
- option «-o»: preserve UIDs
- option «-D»: preserve device files
- option «-v»: verbose mode, lists the files transferred
- option «-z»: compressed transfer
- option «-e ssh»: transfer using SSH as the transport layer

Short options have long equivalents (for example «--archive» = «-a», etc.).
Other long options:
- option «--progress»: shows the progress of the transfer
- option «--delete-after»: deletes extraneous files on the destination after the synchronization
- option «--delete-before»: deletes extraneous files on the destination before the synchronization

WARNING: the syntax is subtle when directories are involved!

Syntax to synchronize a directory object «ananas»:
    rsync -a /source/ananas /var/tmp
The object «ananas» itself is copied to the destination: the final tree is «/var/tmp/ananas/». An extra level is therefore created in the destination.

Syntax to synchronize the contents of a directory object «ananas»:
    rsync -a /source/ananas/ /var/tmp
The contents of the object «ananas» are copied to the destination: the final tree is «/var/tmp/...». The extra level seen with the previous syntax is not created.
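The trailing-slash rule can be summarized in a few lines of Python. This is only a model of the rule described above: it does not run rsync, and the function name `rsync_target` is an assumption made for the example.

```python
import posixpath


def rsync_target(source: str, destination: str) -> str:
    """Return the directory that ends up holding the contents of `source`.

    Models only the trailing-slash rule for a directory source.
    """
    if source.endswith("/"):
        # "dir/" means "the contents of dir": no extra level is created.
        return destination.rstrip("/") or "/"
    # "dir" means "the directory itself": its name is recreated in destination.
    return posixpath.join(destination, posixpath.basename(source))


print(rsync_target("/source/ananas", "/var/tmp"))   # /var/tmp/ananas
print(rsync_target("/source/ananas/", "/var/tmp"))  # /var/tmp
```

When in doubt on a real system, rsync's «--dry-run» option (short form «-n») shows what would be transferred without changing anything.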

Example
Consider the tree «/tmp/exemple» to be synchronized:

    % ls -lR /tmp/exemple
    /tmp/exemple/:
    total 64
    -rwxr-xr-x 1 besancon adm 211 Aug 26 17:53 fichier1*
    -rwxr-xr-x 1 besancon adm 211 Aug 26 17:53 fichier2*
    -rwxr-xr-x 1 besancon adm 211 Aug 26 17:53 fichier3*
    drwxr-xr-x 2 besancon adm 182 Aug 26 17:54 repertoire1/

    /tmp/exemple/repertoire1:
    total 16
    -rwxr-xr-x 1 besancon adm 211 Aug 26 17:53 fichier4*

Synchronization with another machine named «remote.example.com» is done with the following command:

    % rsync \
        --stats \
        --rsh=/usr/local/bin/ssh \
        --rsync-path=/usr/local/bin/rsync \
        --archive \
        --compress \
        --verbose \
        --cvs-exclude \
        /tmp/exemple \
        [email protected]:/tmp

The previous command displays:

    building file list... done
    exemple/
    exemple/fichier1
    exemple/fichier2
    exemple/fichier3
    exemple/repertoire1/
    exemple/repertoire1/fichier4

    Number of files: 6
    Number of files transferred: 4
    Total file size: 844 bytes
    Total transferred file size: 844 bytes
    Literal data: 844 bytes
    Matched data: 0 bytes
    File list size: 161
    Total bytes written: 825
    Total bytes read: 84

    wrote 825 bytes read 84 bytes bytes/sec
    total size is 844 speedup is 0.93

10.4 WINDOWS file synchronization
Many synchronization applications exist for WINDOWS:
- FULLSYNC: [...]
    - synchronization of files between local disks
    - synchronization of files between local and network disks via FTP, SFTP, SMB
    - one-way synchronization without deletion on the destination
    - exact one-way synchronization
    - two-way synchronization
    - CRON-like scheduling
- ports of RSYNC for WINDOWS
- MICROSOFT SyncToy 2.1 (Microsoft only?)
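The «speedup» value in the rsync statistics of section 10.3 can be checked by hand. The short calculation below assumes that rsync reports the total file size divided by the number of bytes actually exchanged (written plus read); that formula is an assumption about rsync's reporting, but it reproduces the figure shown in the example.

```python
total_size = 844      # "total size is 844"
bytes_written = 825   # "wrote 825 bytes"
bytes_read = 84       # "read 84 bytes"

# Ratio between the size of the data and the traffic needed to synchronize it.
speedup = total_size / (bytes_written + bytes_read)
print(f"speedup is {speedup:.2f}")   # speedup is 0.93
```

A value below 1 is expected for a first transfer of small files: everything is sent as literal data and the protocol overhead outweighs the gain from compression.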

Chapter 11 Network authentication mechanism: /etc/passwd

Why this chapter
The basic mechanism for UNIX accounts is «/etc/passwd» + «/etc/shadow». These are text files that can be copied from machine to machine with any file copy tool.

11.1 Introduction
See volume 2 for the part on «/etc/passwd» and «/etc/shadow».
Problem: how do you manage a network of machines that will use only the files «/etc/passwd» and «/etc/shadow»?

11.2 Principle of a solution
One solution: copy the files «/etc/passwd» and «/etc/shadow» from a machine A to the other machines:
- copy to the other machines via SCP (see SSH)
- periodic copy via CRONTAB
- allow password changes only on machine A
What are the drawbacks of this method? What are its advantages?

Chapter 12 Network authentication mechanism: NIS

Why this chapter
NIS = the first (successful) attempt at a centralized system for managing UNIX accounts.

12.1 Introduction
NIS = Network Information Service
- Created by SUN in 1985
- Formerly called Yellow Pages: some commands have a name in «yp...»
- NIS+ version around 1992, radically different (see annex)
It is a network protocol for accessing information centralized on one or more redundant servers.
Most common use: sharing the UNIX account database.

12.2 NIS architecture
The architecture is built in client/server mode:

[Figure: a master server holding the data, two slave servers (slave 1, slave 2) updated from the master by push / pull, and four clients (client 1 to client 4)]

Characteristics:
- Network communication via RPC (Remote Procedure Call)
- Propagation of the data (maps) from the master server to the slave servers in pull mode or in push mode
- Maps are propagated in full
- Only the master server can modify the data
- Slave servers serve the data without being able to modify it

12.3 NIS data: NIS maps, DBM, ypcat, ypmatch
The data handled by NIS are called maps. Maps contain (key, value) pairs.
Only the NIS master server can change the contents of a map.
A map is in DBM format (see «man dbm»); a map consists of 3 files:
- the source file
- the file with suffix «.pag»
- the file with suffix «.dir»

The «makedbm» command converts the source file into the 2 files that make up the DBM:

    % cat demo
    clef1 banane
    clef2 arbre
    % makedbm demo demo
    % ls -l demo*
    -rw-r--r-- 1 besancon adm 23 Aug 15 11:56 demo
    -rw------- 1 besancon adm 0 Aug 15 11:57 demo.dir
    -rw------- 1 besancon adm 1024 Aug 15 11:57 demo.pag

In the NIS system, maps are stored on the master server in «/var/yp/nis-domain-name»:

    % cd /var/yp/nis-domain-name
    % ls -l passwd*
    -rw------- 1 root other 4096 Nov 23 07:26 passwd.byname.dir
    -rw------- 1 root other 8192 Nov 23 07:26 passwd.byname.pag
    -rw------- 1 root other 4096 Nov 23 07:26 passwd.byuid.dir
    -rw------- 1 root other 8192 Nov 23 07:26 passwd.byuid.pag
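The conversion performed by «makedbm» can be imitated with Python's standard `dbm.dumb` module. This is only an analogy meant to show the (key, value) structure of a map: `dbm.dumb` writes its own file format («.dat», «.dir», «.bak»), not the «.pag»/«.dir» pair used by NIS, and the helper name `build_map` is an assumption made for the example.

```python
import dbm.dumb
import os
import tempfile

SOURCE = "clef1 banane\nclef2 arbre\n"   # same content as the "demo" file


def build_map(source_text: str, path: str) -> None:
    """Store each line as (first word -> rest of the line), like a NIS map."""
    with dbm.dumb.open(path, "n") as db:
        for line in source_text.splitlines():
            if not line.strip():
                continue
            key, _, value = line.partition(" ")
            db[key] = value.strip()


workdir = tempfile.mkdtemp()
map_path = os.path.join(workdir, "demo")
build_map(SOURCE, map_path)

# Reading the map back: a lookup by key, then the list of all keys.
with dbm.dumb.open(map_path, "r") as db:
    value = db["clef1"].decode()
    keys = sorted(k.decode() for k in db.keys())

print(value)  # banane
print(keys)   # ['clef1', 'clef2']
```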

Maps are built automatically from the source files of the maps:

[Figure: on the NIS master, makedbm builds passwd.byname and passwd.byuid from /etc/passwd, and hosts.byname and hosts.byaddr from /etc/hosts]

The file «/var/yp/Makefile» automates the creation of all the maps and their propagation to the slave servers (push mode).

Extract from «/var/yp/Makefile»:

    ...
    hosts.time: $(B) -l $(DIR)/hosts $(CHKPIPE)) \
        (awk 'BEGIN { OFS="\t"; } $$1 !~ /^#/ { print $$1, $$0 }' $(CHKPIPE)) \
        $(MAKEDBM) $(B) - "updated
        [ ! $(NOPUSH) ]; then $(YPPUSH) -d $(DOM) hosts.byname;
        [ ! $(NOPUSH) ]; then $(YPPUSH) -d $(DOM) hosts.byaddr;
        [ ! $(NOPUSH) ]; then echo "pushed hosts"; fi
    ...
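The fact that one source file yields several maps, each indexed by a different key, can be sketched as follows. The passwd lines are hypothetical sample data and `build_passwd_maps` is a name invented for the example; the real maps are produced by the NIS Makefile and «makedbm», not by this code.

```python
PASSWD_SOURCE = """\
root:x:0:0:Super-User:/:/sbin/sh
besancon:x:1000:4:Example user:/home/besancon:/bin/sh
"""  # hypothetical /etc/passwd content


def build_passwd_maps(text: str):
    """Build two (key, value) maps from the same source lines."""
    byname, byuid = {}, {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        byname[fields[0]] = line   # passwd.byname: key = login name
        byuid[fields[2]] = line    # passwd.byuid:  key = UID (as text)
    return byname, byuid


byname, byuid = build_passwd_maps(PASSWD_SOURCE)
print(byname["besancon"])   # lookup by login
print(byuid["0"])           # lookup by UID
```

Both maps hold the same values; only the key differs, which is why a lookup by login and a lookup by UID are each a single key access.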

Building a map then comes down to the following (for example after a change to /etc/hosts):

    # vi /etc/hosts
    # cd /var/yp
    # make hosts
    updated hosts
    pushed hosts

The DBM library limits records to a maximum size of 1024 bytes:

    % man dbm
    SunOS/BSD Compatibility Library Functions                dbm(3b)

    NAME
         dbm, dbminit, dbmclose, fetch, store, delete, firstkey,
         nextkey - data base subroutines

         The sum of the sizes of a key/content pair must not exceed
         the internal block size (currently 1024 bytes). Moreover all
         key/content pairs that hash together must fit on a single
         block. store will return an error in the event that a disk
         block fills with inseparable data.

Some map names:
passwd.byname, passwd.byuid, group.byname, group.bygid, publickey.byname, hosts.byaddr, hosts.byname, mail.byaddr, mail.aliases, services.byname, services.byservicename, rpc.bynumber, rpc.byname, protocols.bynumber, protocols.byname, networks.byaddr, networks.byname, netmasks.bymask, netmasks.byaddr, ethers.byname, ethers.byaddr, bootparams, auto.master, auto.home, auto.direct, auto.src

The most useful ones are:
- map «passwd»
- map «group»
- map «hosts»
- map «netgroup»

The «ypcat» command displays a NIS map from any client.
Syntax: «ypcat nis-map»

The «ypmatch» command displays the value of one or more keys in a given NIS map from any client.
Syntax: «ypmatch key1 key2 ... nis-map»

12.4 NIS client, domainname, ypbind, ypwhich, ypset
A NIS client must connect to a NIS server. This action is called binding.
Binding requires:
- providing a NIS domain name, the domainname; a machine thereby declares itself a member of the group served by the NIS servers
- specifying how the NIS server is located: by broadcast or explicitly

Domain name
The command that sets the domain name is domainname.
To display the domain name: «domainname»
To set the domain name manually: «domainname nis-domain-name»

Setting the domainname automatically at boot:
- On Solaris: fill in the file «/etc/defaultdomain»
- On Linux: fill in the variable «NISDOMAIN» in the file «/etc/sysconfig/network»

    NETWORKING=yes
    FORWARD_IPV4=false
    HOSTNAME=pcars6.formation.jussieu.fr
    DOMAINNAME=formation.jussieu.fr
    GATEWAY=
    GATEWAYDEV=eth0
    NISDOMAIN=real.world

WARNING: on LINUX, do not confuse it with the variable «DOMAINNAME».

Performing the binding
A NIS client runs the «ypbind» daemon, which connects to a NIS server found by one of 2 possible methods:
- discovery by broadcast; this is the default mode. On Solaris: «/usr/lib/netsvc/yp/ypbind -broadcast». In practice there is a map «ypservers» that contains the names of the servers. See «/var/yp/binding/nis-domain-name/ypservers».
- explicit connection request. On Solaris, run:

    # ypbind -ypsetme
    # ypset desired-nis-server-name

The «ypwhich» command displays the name of the NIS server in use.

It is possible to exert some control over which clients bind to the servers. To do so, fill in the file «/var/yp/securenets» on the slave servers and on the master server. It lists the authorized machines as addresses and netmasks.
For example: [...]
Meaning: only the machines of the networks «[...]/16» and «[...]/24» are allowed to bind.

Consulting the maps
A NIS client must state which maps it will use. The most common one is the map «passwd», whose use is declared by adding a line at the end of the file «/etc/passwd»:

    +::65534:65534:::

Meaning of this extra line (to be checked on each system because there are differences):
- Any field filled in on this «+» line unconditionally replaces the same field of the map, except for UID and GID.
- For UID and GID, the values given apply only if these fields are missing from the map (that is, when the map is corrupted, which points to a corrupted source file).
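The override rule of the «+» line can be sketched in Python as follows. It implements only the rule as stated above (which, as noted, varies between systems); the function name `apply_plus_entry` and the two map entries are hypothetical.

```python
def apply_plus_entry(map_line: str, plus_line: str) -> str:
    """Merge one passwd map entry with a '+' template line from /etc/passwd."""
    entry = map_line.split(":")
    template = plus_line.split(":")
    merged = [entry[0]]                      # the login always comes from the map
    for index in range(1, 7):
        override = template[index]
        if index in (2, 3):                  # UID and GID: used as fallback only
            merged.append(entry[index] or override)
        else:                                # other fields: replaced if filled in
            merged.append(override or entry[index])
    return ":".join(merged)


plus = "+:*LK*:65534:65534:::/usr/local/bin/tcsh"
from_map = "jean:Xy12abCdEfGhI:1001:100:Jean:/home/jean:/bin/sh"  # hypothetical
print(apply_plus_entry(from_map, plus))
# jean:*LK*:1001:100:Jean:/home/jean:/usr/local/bin/tcsh

damaged = "pierre:x:::Pierre:/home/pierre:/bin/sh"  # hypothetical, no UID/GID
print(apply_plus_entry(damaged, "+::65534:65534:::"))
# pierre:x:65534:65534:Pierre:/home/pierre:/bin/sh
```

The first call shows the password and shell fields being forced while UID and GID are kept from the map; the second shows the fallback UID/GID applied to a damaged entry.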

Example:

    +:*LK*:65534:65534:::/usr/local/bin/tcsh

Meaning:
- the encrypted password of the users of the map passwd is «*LK*»
- the UID will be 65534 if the map entry does not specify a UID
- the GID will be 65534 if the map entry does not specify a GID
- the login shell is automatically set to «/usr/local/bin/tcsh»

12.5 NIS slave server, ypserv, ypxfr
A NIS slave server runs several daemons:
- ypserv
- ypbind
The «ypserv» daemon answers the requests of the NIS clients that are bound to it.
The «ypbind» daemon is only there to make the slave server a NIS client as well (this is not mandatory).
There is no guarantee that the slave server is a NIS client of itself. It may bind to another NIS server of the same domain.

A slave server may be down at the moment the master server pushes the maps. The slave server therefore needs to resynchronize with the master server: the slave server pulls the maps.
This is done with shell scripts run periodically from the crontab:

    30 * * * * /usr/lib/netsvc/yp/ypxfr_1perhour
    31 1,13 * * * /usr/lib/netsvc/yp/ypxfr_2perday
    32 1 * * * /usr/lib/netsvc/yp/ypxfr_1perday

These scripts fetch more or fewer maps depending on how often they are run.

Example of one of these shell scripts, «ypxfr_1perhour»:

    #! /bin/sh
    # ypxfr_1perhour.sh - Do hourly NIS map check/updates
    PATH=/bin:/usr/bin:/usr/lib/netsvc/yp:$PATH
    export PATH
    ypxfr passwd.byname
    ypxfr passwd.byuid

12.6 NIS master server, ypxfrd, rpc.yppasswdd, yppasswd
A NIS master server runs several daemons:
- ypserv
- ypbind
- ypxfrd
- rpc.yppasswdd
«ypserv» plays the same role as on a slave server.
«ypbind» plays the same role as on a slave server.
The «ypxfrd» daemon handles the map transfers requested by the slave servers (pull mode). (In UNIX, xfr is a common abbreviation for transfer.)
The «rpc.yppasswdd» daemon handles password changes.

With NIS, a NIS client cannot modify the contents of a map.
To change a password, the change is emulated on the master server in its source file («/etc/passwd»), then the map passwd is rebuilt and transmitted in full to the slave servers.
This is done with the «yppasswd» command, which asks the user for the passwords and then calls «rpc.yppasswdd» on the master server, which simulates an interactive session made of the commands:

    # passwd
    # cd /var/yp
    # make passwd

On a Linux NIS client:

    % yppasswd
    Changing NIS account information for besancon on linux.unixiens.org.
    Please enter old password: ********
    Changing NIS password for besancon on linux.unixiens.org.
    Please enter new password: ********
    Please retype new password: ********
    The NIS password has been changed on linux.unixiens.org.

On a Solaris NIS master server:

    % yppasswd
    Enter login(nis) password: ********
    New password: ********
    Re-enter new password: ********
    NIS passwd/attributes changed on linux.unixiens.org

12.7 Netgroups
File «/etc/netgroup»
The NIS system makes it possible to define access authorization groups: netgroups. These groups are distributed through the map netgroup.
A netgroup is a symbolic name associated with a set of triples (I have never seen the third field be of any use in practice):

    netgroup-name \
        (machine, user, nis-domain-name) \
        (machine, user, nis-domain-name) \
        ...

In practice, one defines netgroups of machines and netgroups of users. Groups of users or of machines are then allowed or denied access to certain resources.

Example of a netgroup of machines:

    nains \
        (atchoum.example.com,,mine-de-diamants) \
        (dormeur.example.com,,mine-de-diamants) \
        (joyeux.example.com,,mine-de-diamants) \
        (grincheux.example.com,,mine-de-diamants) \
        (prof.example.com,,mine-de-diamants) \
        (timide.example.com,,mine-de-diamants) \
        (simplet.example.com,,mine-de-diamants)

Example of a netgroup of users:

    etudiants \
        (,jean,) \
        (,pierre,) \
        (,valerie,)

Example of the use of a netgroup of users in «/etc/passwd»:

    field:password HERE:0:1:Field Service:/usr/field:/bin/csh
    operator:password HERE:5:28:Operator:/opr:/opr/opser
    sys:password HERE:2:3:Mr Kernel:/usr/sys:
    bin:password HERE:3:4:Mr Binary:/bin:
    pot:*:16:16:menupot:/users/staffs/pot:
    -@etudiants:
    +@net_administrateurs::0:0:::
    +@net_utilisateurs::65534:65534:::/bin/noshell

Meaning:
- Lines of the map «passwd» whose login is listed in the netgroup «etudiants» are rejected.
- Lines of the map «passwd» whose login is listed in the netgroup «net_administrateurs» are accepted.
- Lines of the map «passwd» whose login is listed in the netgroup «net_utilisateurs» are accepted.
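The way netgroup triples and the «-@» / «+@» lines combine can be sketched as follows. The members of «net_utilisateurs», the helper names and the "first matching rule wins" ordering are assumptions made for the example; the exact precedence should be checked on each system.

```python
import re

NETGROUP_SOURCE = {
    "etudiants": "(,jean,) (,pierre,) (,valerie,)",
    "net_utilisateurs": "(,alice,) (,jean,)",   # hypothetical members
}


def parse_triples(definition: str) -> list:
    """Return a list of (machine, user, nis_domain) tuples."""
    return [tuple(part.strip() for part in inner.split(","))
            for inner in re.findall(r"\(([^)]*)\)", definition)]


def users_of(netgroup: str) -> set:
    """Users named in a netgroup (empty user fields are ignored)."""
    triples = parse_triples(NETGROUP_SOURCE[netgroup])
    return {user for _machine, user, _domain in triples if user}


def allowed_logins(rules: list, map_logins: list) -> list:
    """Apply '-@group' / '+@group' rules in order; the first match decides."""
    allowed = []
    for login in map_logins:
        for rule in rules:
            sign, group = rule[0], rule[2:]
            if login in users_of(group):
                if sign == "+":
                    allowed.append(login)
                break
    return allowed


rules = ["-@etudiants", "+@net_utilisateurs"]
print(parse_triples("(atchoum.example.com,,mine-de-diamants)"))
print(allowed_logins(rules, ["jean", "alice", "bob"]))  # ['alice']
```

Here «jean» is rejected because the «-@etudiants» rule matches first, «alice» is accepted through «net_utilisateurs», and «bob» appears in no netgroup and is therefore not imported.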

Example of the use of a netgroup of machines when exporting disks via NFS (file «/etc/exports», see the chapter on NFS):

    /usr/openwin -access=nains

12.8 Installing NIS
Master server
- Run «ypinit -m»
Slave servers
- Run «ypinit -s master-server»
- Add the calls to the «ypxfr_*» scripts to the crontab
NIS client
- Specify the domainname

Chapter 13 Network authentication mechanism: NIS+

Why this chapter
NIS+ = the successor to NIS, but it was not a success.
This chapter only contains a document about NIS+.

13.1 Introduction / Conclusion
See the annex for a document about NIS+.
NIS+ was not a success and has now been officially abandoned in favor of LDAP by its main advocate, SUN.

Main criticisms of NIS:
- no authentication of the client to the NIS servers; knowing the domainname is enough to bind
- maps are transmitted in full even when their contents have changed only slightly
- the NIS domain principle is ill-suited to WAN structures
- broadcast mode

13.2 Annex 1
Attached to the printed version of this course is a document about NIS+.

Solaris ONC Network Information Service Plus (NIS+)
A White Paper

1991 by Sun Microsystems, Inc. Printed in USA. Garcia Avenue, Mountain View, California

All rights reserved. No part of this work covered by copyright may be reproduced in any form or by any means graphic, electronic or mechanical, including photocopying, recording, taping, or storage in an information retrieval system without prior written permission of the copyright owner.

Portions of this paper were previously published in the proceedings of Sun User Group, United Kingdom (SUG UK).

The OPEN LOOK and the Sun Graphical User Interfaces were developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS (October 1988) and FAR (June 1987).

The product described in this manual may be protected by one or more U.S. patents, foreign patents, and/or pending applications.

TRADEMARKS
Sun Microsystems, the Sun Logo, NFS, NeWS and SunLink are registered trademarks, and Sun, SunSoft, the SunSoft Logo, Solaris, SunOS, AnswerBook, Catalyst, CDWare, Copilot, DeskSet, Link Manager, Online: DiskSuite, ONC, OpenWindows, SHIELD, SunView, ToolTalk and XView are trademarks of Sun Microsystems, Inc., licensed to SunSoft, Inc., a Sun Microsystems company. SPARC is a registered trademark of SPARC International, Inc. SPARCstation is a trademark of SPARC International, Inc., licensed exclusively to Sun Microsystems, Inc. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX and OPEN LOOK are registered trademarks of UNIX System Laboratories, Inc. X Window System is a product of the Massachusetts Institute of Technology. All other products referred to in this document are identified by the trademarks of the companies who market those products.

Table of Contents
Executive Summary
Introduction
NIS+ Overview
NIS+ Features
Who Benefits?
NIS+ Architecture
Implementation of NIS+ Service
Glossary of Terms


Network Information Service Plus (NIS+)
Chuck McManis, Saqib Jang

Executive Summary
The enormous growth in network computing over the past few years presents significant challenges for distributed system administrators, end users, and application developers. Distributed networks have become larger, typically consisting of interconnected subnetworks spanning multiple sites. Management of these networks, with tasks such as addition, relocation, and removal of network resources, including hosts, applications, and printers, has taken on added complexity. The growth in size of distributed systems presents complex requirements for applications and end users to transparently access resources across a network.

To a large extent, such challenges can be addressed through the deployment of a robust, high-performance, network-accessible repository of distributed system resources commonly called an enterprise naming service. This paper provides an architectural overview of SunSoft's Network Information Service Plus (NIS+), an enterprise naming service for heterogeneous distributed systems.

Introduction
NIS+ replaces ONC Network Information Service (NIS) and is designed to address the management and resource location requirements for heterogeneous distributed systems of the 90s. It is a repository of user-friendly names and attributes of network resources such as hosts, applications, users, and mailboxes. Clients of NIS+, both applications and users, can efficiently look up information on network resources and access these resources in a location-independent way. NIS+ also plays a critical role in the efficient operation and administration of distributed systems by acting as a central point for addition, removal, or relocation of resources.

The NIS+ naming service was designed to address the following goals:
- To prevent unauthorized access to network resources and act as the platform for distributed system security.
- To scale effectively from very small to very large networks, consisting of tens of thousands of systems.
- To provide the ability to easily administer from very small to large networks, spanning multiple sites.
- To provide for autonomous administration of subnetworks.
- To provide highly consistent information.

NIS+ Overview
NIS+ works with the ONC distributed computing platform, which is an integral part of the Solaris environment. Solaris 2.0 is comprised of SunOS 5.0, enhanced ONC, NIS+, OpenWindows V3, DeskSet V3, and OPEN LOOK. A network computing standard that enables Solaris-based systems to connect to any other proprietary or standard system, ONC permits users to access information throughout the network and make use of all the computing power in the user organization, regardless of location. ONC is supported on all major hardware platforms, from PCs to mainframes. With an installed base of over 1.3 million systems, ONC is unrivalled as the industry standard for distributed computing in heterogeneous networks.

NIS+ is a client/server application built atop the ONC Transport-Independent Remote Procedure Call (TI-RPC) interface. RPC applications are also clients of the NIS+ name service. Since the RPC client and server components of an application may be located on arbitrary machines in a large network, RPC clients use the name service to locate and bind to RPC servers in a flexible and high-performance way.

NIS+ Enhancements
NIS+ includes enhancements which fall into three general categories: the structure of the global namespace (1), the structure of data within maps, and the authentication and authorization models associated with the namespace and the data that it contains.

(1) Refer to Glossary of Terms at the end of this paper for definition of namespace and other terms.

First, NIS+ includes support for hierarchical domain names, which is a better model for a namespace of many smaller domains. This model has been very successfully used by the Domain Name System that is currently deployed on the Internet and brings other benefits to the system as well. The most important benefit is that hierarchical domains provide a structure upon which to hang a distributed authority mechanism. Other advantages of hierarchical names are that, given some basic information, there are well-known mechanisms for locating nodes within a tree, and a mechanism for generating globally unique names that can be based on using the domain name as part of the name.

Secondly, NIS+ includes a new database model, which consists of two parts. The first part is a record containing the schema for the database that is stored in the namespace. The second part is the database itself, which is managed by the same server that serves the portion of the namespace where the schema record was found. To keep the databases simple and the representation of the schema manageable, only two properties of the database were chosen to be part of the schema. These properties are the number of columns in a database, and an indication for each column as to whether or not it should be used as one of the indexes for the database records. Additionally, for columns that were determined to be searchable, a flag is present to specify whether or not the case of the characters should be considered when searching the database with this index. Using this model, NIS dbm databases are described as two-column databases, with the first column searchable and case sensitive.

Figure 1 shows a graphical representation of how these hierarchical domains and databases, called tables, are related in a global namespace.
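The schema model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Column` and `TableSchema`, the `matches` helper, and the `HOSTS_LIKE` example are hypothetical and are not part of the NIS+ API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of the schema (hypothetical name, not an NIS+ type)."""
    searchable: bool = False      # may the column be used as an index?
    case_sensitive: bool = True   # only meaningful for searchable columns


@dataclass(frozen=True)
class TableSchema:
    """The two schema properties named in the text: column count and index flags."""
    columns: tuple

    @property
    def num_columns(self):
        return len(self.columns)

    def matches(self, index, stored, wanted):
        """Compare a stored value with a search value, honouring the case flag."""
        column = self.columns[index]
        if not column.searchable:
            raise ValueError(f"column {index} is not searchable")
        if column.case_sensitive:
            return stored == wanted
        return stored.lower() == wanted.lower()


# An NIS dbm map in this model: two columns, the first searchable and case sensitive.
NIS_DBM = TableSchema((Column(searchable=True, case_sensitive=True), Column()))

# A hypothetical table whose first column is searched without regard to case.
HOSTS_LIKE = TableSchema((Column(searchable=True, case_sensitive=False), Column()))
```

With these definitions, `NIS_DBM.matches(0, "Pepper", "pepper")` is false while `HOSTS_LIKE.matches(0, "Pepper", "pepper")` is true, which is the only behavioural difference the case flag introduces.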

[Figure 1: Graphical Representation of the New NIS+ Namespace. Each domain is served by a master server and replica servers; each table object records the number of columns (n), the searchable columns (0, 1), and the case-sensitive columns (0) for columns Col 0 through Col n.]

Thirdly, there are two classes of changes that have been made to the NIS authentication and authorization model. One is the ability to authenticate access to the service and thus discriminate between accesses that are allowed to members of the community versus other network entities. The other is an authorization model to allow specific rights to be granted or denied based on this authentication.

Authentication is provided by mechanisms available to all users of ONC RPC. This allows the server to ensure the identity of principals making requests and clients to ensure the identity of the server answering the request.

A second change is the need for the information to have an authorization model associated with it. This is accomplished by using an authorization model similar to the UNIX file system model. This model specifies that each item in the namespace has a set of access rights associated with it, and that these rights are granted to three broad classes of principals: the owner of the item, a group owner of the item, and all other principals. The specific access rights are different from the traditional read, write, and execute rights that the file system grants owing to the nature of information

services. Also, it is useful to include a class of clients which are not authenticated, the nobody principal. The authentication and authorization mechanisms incur a finite performance penalty when they are in effect. By allowing the client to specify access rights to unauthenticated principals, we allow them to choose better performance at the cost of weaker security.

Finally, access rights to individual rows and columns within the database are also provided. This provides a desirable property: we can maintain the privacy of particular fields in a record, such as the encrypted password, while still giving unrestricted access to the other fields of database records.

NIS+ Features

The highlights of the NIS+ naming service's capabilities are:
- Support of a hierarchical namespace enables simplified distributed management. The namespace can be divided into domains, with each domain managed on an autonomous basis.
- Comprehensive and flexible security mechanisms prevent unauthorized access to network resources. Authentication facilities allow verification of a client's identity, while authorization is provided to ensure the client has rights to perform the desired name service operation.
- Partitioning of name service information provides improved scalability. The unit of partitioning is the directory, which contains object information for each namespace domain.
- Replication, on a per-directory basis, allows high availability and reliability. Each domain has a master directory to which all updates are applied and a number of slave replicated directories. Fast consistency between directory replicas enables support of rapidly changing network environments.
- A complete and consistent suite of administrative, namespace, and information base functions with programmatic interfaces enables simplified access to and management of the name service (subject to security constraints). Administrative operations can be performed remotely and flexibly.
- Namespace and information base operations allow applications and users to efficiently access and modify information on network objects.
- Compatibility allows NIS-based applications and environments to migrate smoothly to NIS+.

Who Benefits?

System Administrators

NIS+ is a powerful tool for simplified administration of heterogeneous distributed systems. As the size of such systems grows and the requirements for decentralized administration emerge, a multidomain hierarchical namespace can be created. Assignment of names and modification of information on network resources within each domain can be decentralized. Further, the hierarchy can correspond to organizational (with each domain representing a functional group, for example, Engineering or Marketing) or logical hierarchies, allowing administrators to use intuitive schemes for organizing their namespace.

Comprehensive security schemes benefit administrators by ensuring that the name service is protected from unauthorized access. This is critical since the name service functions as the primary means for administrators to add, remove, or modify network resources. The security functionality is flexible, as administrative groups can ensure that name service information under their control is protected from access and modification by principals outside of that organization.

Partitioning allows administrators to continue using NIS+ as the platform for distributed management as networks grow in size. As multiple domains are created, the overall namespace can have unbounded growth, yet the size of an individual directory remains within bounds.

Replication of individual directories, on a master/slave basis, allows coping with computer and communication link failures. In addition, NIS+ provides for fast transfer of updates from master to slave servers, allowing authorized administrators to rapidly add, remove, or relocate network resources.

NIS+ also includes a set of functions for flexible and easy administration of the name service itself. This includes functions for starting and stopping NIS+ servers, replicating and partitioning operations, and setting security levels. NIS+ functions support a corresponding Application Programming Interface (API) designed to allow NIS+ to be the platform for next-generation administrative applications from SunSoft and Independent Software Vendors (ISVs).

Application Developers

Application developers can use the NIS+ API in a number of ways. First, applications can use the API to look up information on network resources. The SunOS 5.0 system uses NIS+ as the repository for storage of information on hosts, passwords, and users for administrative purposes. Applications can use the NIS+ API to access this information in a high-performance fashion.

Second, NIS+ can be used as a secure repository of network-accessible application-specific data. Improved consistency and the read/write capability within the NIS+ API enable the service to be used to store and modify application-specific information. For example, the OpenWindows V3 Calendar Manager uses NIS+ to store group schedule information that authorized group members can access and modify to schedule meetings.

Finally, new administrative applications can be developed that run atop the API and take advantage of its simplicity and consistency in providing access to network resources. NIS+ will serve as the platform for future administrative applications from SunSoft. NIS+ also provides full compatibility with the NIS (i.e., yp_xxx) programming interface to allow ease of application migration.

End Users

NIS+ also provides significant advantages for end users. The security functionality within NIS+ enables users to trust network communications and to protect sensitive information from unauthorized access. Users' productivity increases as applications transparently locate resources by accessing the central data storage facility with NIS+. The simplicity and completeness of the NIS+ interface enables users to take on a greater proportion of administrative tasks, which is particularly advantageous where administration resources can't keep pace with constantly growing distributed systems.

NIS+ Architecture

NIS+ provides two types of services to clients. The first is a name service that maps names, such as domain names, to their respective servers. The second is a directory service where the desired information itself, such as the UNIX password record, is returned rather than a pointer to it.

NIS+ Naming Model

The naming model used by NIS+ is a graph structured as a singly rooted tree. Within this graph each vertex represents one NIS+ object, each of which may have several children associated with it but only one parent. There are six types of NIS+ objects defined: directory, table, group, link, entry, and private.

Directory objects identify a database of NIS+ objects. Objects within that database are represented as children of the directory object. A directory object and all of its children is an NIS+ domain. An NIS+ directory that is a child of another NIS+ directory is a subdomain, and all domains below the root directory are the NIS+ namespace.

An NIS+ object name consists of several labels, each separated from the next by a dot (.) character. The rightmost label is closest to the root of the namespace. Labels that contain the dot character are quoted. These names are shown graphically in Figure 2. Names that end with the dot character are said to be fully qualified, whereas names which do not are said to be partially qualified. Names that identify objects in the namespace are called NIS+ regular names.

[Figure 2: Construction of a Multipart NIS+ Name. Labels shown in the tree: mumble, baz, fred, foo, bar, bob.smith. Resulting names: bob.smith.fred.mumble. and foo.bar.baz.mumble.]
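As a minimal sketch of the naming rules above, the following Python function splits a regular name into labels and reports whether it is fully qualified. The function name `parse_regular_name` is hypothetical, and the sketch deliberately ignores quoted labels, so a label that itself contains a dot is not handled here.

```python
def parse_regular_name(name):
    """Return (labels ordered from the root downwards, fully_qualified).

    Assumption: no label is quoted, so every dot is a separator.
    """
    fully_qualified = name.endswith(".")
    body = name[:-1] if fully_qualified else name
    labels = body.split(".") if body else []
    # The rightmost label is closest to the root, so reverse the list.
    return list(reversed(labels)), fully_qualified
```

For the Figure 2 name `foo.bar.baz.mumble.` this yields the labels `mumble`, `baz`, `bar`, `foo` in root-first order and marks the name as fully qualified; `foo.bar` is reported as partially qualified.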

NIS+ Database Model

Table objects in the NIS+ namespace identify databases called tables. These databases are called tables because the model used is that of a columnar table. The object contains the schema for the database it identifies. This schema specifies the number of columns in the table and identifies which columns can be searched with an NIS+ query.

Rows within an NIS+ table are identified by a compound name syntax called indexed names. These names contain a search criterion and a regular name. The regular name portion identifies the NIS+ table to search. The search criterion identifies the rows of interest by specifying what value one or more searchable columns must contain to satisfy the search. The search criterion consists of an open bracket [ followed by zero or more attributes of the form column_name = column_value followed by a close bracket character ]. Attributes in the search criterion are separated by commas, and the search criterion and the name of the table to apply it to are also separated by a comma.

An example of these compound names is shown in Figure 3. In the first example, the name [manager=susan],employees.widget.com. selects only the fourth row, which is the only one that satisfies this search. In the second example, the name [manager=george],employees.widget.com. selects the first three rows, which all satisfy the search criterion. Finally, the name [manager=george, name=bob],employees.widget.com. would return only the first row, because only that row satisfies the complete criterion. The null search criterion, [ ], selects all of the rows.

Table name: employees.widget.com.

name    manager   department    division
bob     george    engineering   electronics
mary    george    engineering   electronics
jane    george    engineering   electronics
sam     susan     sales         electronics

[manager=susan],employees.widget.com.    selects the row for sam
[manager=george],employees.widget.com.   selects the rows for bob, mary, and jane

Figure 3 Indexed Names Selecting Entries From a Table
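The indexed-name examples above can be reproduced with a small Python sketch. The table contents come from Figure 3; the helper names `parse_indexed_name` and `search`, the in-memory `TABLES` dictionary, and the exact-match comparison are assumptions made for illustration, and quoting of terminal characters is not handled.

```python
import re

# The employees table of Figure 3, held in memory for the example.
TABLES = {
    "employees.widget.com.": {
        "columns": ["name", "manager", "department", "division"],
        "rows": [
            ["bob", "george", "engineering", "electronics"],
            ["mary", "george", "engineering", "electronics"],
            ["jane", "george", "engineering", "electronics"],
            ["sam", "susan", "sales", "electronics"],
        ],
    }
}


def parse_indexed_name(indexed):
    """Split '[col=val, ...],table.name.' into (attribute dict, table name)."""
    match = re.fullmatch(r"\[(.*?)\],\s*(\S+)", indexed.strip())
    if match is None:
        raise ValueError(f"not an indexed name: {indexed!r}")
    criterion, table_name = match.groups()
    attributes = {}
    for part in criterion.split(","):
        if part.strip():                      # the null criterion [ ] has no attributes
            column, value = part.split("=", 1)
            attributes[column.strip()] = value.strip()
    return attributes, table_name


def search(indexed):
    """Return the rows whose columns equal every attribute in the criterion."""
    attributes, table_name = parse_indexed_name(indexed)
    table = TABLES[table_name]
    position = {name: i for i, name in enumerate(table["columns"])}
    return [
        row for row in table["rows"]
        if all(row[position[col]] == val for col, val in attributes.items())
    ]
```

Calling `search("[manager=george, name=bob],employees.widget.com.")` returns only the row for bob, and `search("[ ],employees.widget.com.")` returns all four rows, matching the behaviour described in the text.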

The set of names that NIS+ will accept can be defined by the grammar shown in Figure 4. This grammar defines the terminal characters dot (.), comma (,), open bracket ([), close bracket (]), and equals (=). Generally, it is inadvisable to put terminal characters in the strings that make up the NIS+ names. However, should that be necessary, a quoting mechanism based on the double quote (") character is provided. All characters between two double quote characters are not scanned. The double quote may itself be quoted by placing two double quote characters adjacent to each other. These two characters are treated as a single instance of the double quote character.

NAME             ::= <REGULAR_NAME> | <INDEXED_NAME>
REGULAR_NAME     ::= . | <STRING>. | <STRING>.<REGULAR_NAME>
INDEXED_NAME     ::= <SEARCH_CRITERION>,<REGULAR_NAME>
SEARCH_CRITERION ::= [ <ATTRIBUTE_LIST> ]
ATTRIBUTE_LIST   ::= <ATTRIBUTE> | <ATTRIBUTE>,<ATTRIBUTE_LIST>
ATTRIBUTE        ::= <STRING> = <STRING>
STRING           ::= see note

Note: ISO Latin 1 character set; the initial character in the string may not be [character missing in the transcription] or - ; embedded terminal characters must be protected by double quotes.

Figure 4 NIS+ Name Grammar

In addition to the schema, the table object holds other properties associated with the database. These include the table's type, a concatenation path, and a separator character that clients printing entries from the table use to separate the data from each column in the output. All of these properties are shown graphically in Figure 5.
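The quoting mechanism can be illustrated with a short Python sketch that splits a regular name on dots while leaving quoted text unscanned. The function name `split_labels` is hypothetical, and the sketch assumes that a doubled double quote is interpreted as a literal quote only inside a quoted run; the text does not say how a real NIS+ parser treats it elsewhere.

```python
def split_labels(name):
    """Split an NIS+ regular name on dots that lie outside double quotes."""
    labels, current, in_quotes, i = [], [], False, 0
    while i < len(name):
        ch = name[i]
        if ch == '"':
            if in_quotes and name[i + 1:i + 2] == '"':
                current.append('"')   # "" inside quotes stands for one literal quote
                i += 1                # skip the second quote of the pair
            else:
                in_quotes = not in_quotes
        elif ch == "." and not in_quotes:
            labels.append("".join(current))   # an unquoted dot ends the label
            current = []
        else:
            current.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated double quote")
    if current:
        labels.append("".join(current))       # trailing label of a partially qualified name
    return labels
```

Under these assumptions, `split_labels('"bob.smith".fred.mumble.')` keeps `bob.smith` as a single label, which is how the quoted label of Figure 2 would be read.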

[Figure 5: Graphical Representation of an NIS+ Table. Three tables of type "foo records" are shown: foo.bar. (path fie.bar.:foe.bar.), fie.bar. (path foo.bar.:foe.bar.), and foe.bar. (path foo.bar.:fie.bar.). Each table holds rows 0 to m, and each row consists of an Entry Data column followed by columns 1 to n. The search path is drawn starting from foo.bar.]

The type property is a string that the server uses to prevent entries from being added to a table when they are not compatible with the other entries in the table. This string is compared to the type string of the entry being added and, if they do not match, the update is rejected. The server also compares the number of columns in the entry being added to the number of columns in the table, as well as their type (binary or text), to minimize mistakes that could corrupt the database.

The concatenation path, or search path, is a string containing the list of tables to search if the search criterion in the indexed name doesn't match any entries in the table. This search path allows all of the tables identified within it to appear to the client as one large table. When entries from several tables in the path all match the same search criteria, the first ones found are the ones returned to the client.

In addition to the column data, each row in a table contains some specific information about itself in the column labelled Entry Data. This entry data consists of the name of the owner for this row, a group owner, a time-to-live value for the row, and a set of access rights for this row. The entry data is

combined with the data from the columns when a row is returned in the form of an entry object. This data is present for all rows, but is neither searchable nor explicitly identified as a column in the table object.

NIS+ Object Model

NIS+ objects are implemented as variant records. All objects share a common set of properties which include an owner name, group owner name, access rights, object identifier, and time-to-live values. Additionally, each type of object has a variant part that contains specific information for that type of object. Directory objects contain the names, addresses, and authentication information for machines that serve a domain. Table objects contain the schema for NIS+ tables. Group objects contain a list of principals that are members of that group. Link objects contain the name of another object, and entry objects contain data from an NIS+ table.

A diagram of these objects is shown in Figure 6. The shaded portion represents properties that are unique to a particular type of object. The top portion shows properties that are common to all objects.

Common properties (all objects): oid, name, domain, owner, group, access rights, time to live.
Variant properties:
- DIRECTORY: directory name, directory type, server(s) (master, replica(s))
- TABLE: table type, number of columns, column description(s) (column 0, column ...)
- GROUP: group flags, group member(s) (member 0, member ...)
- LINK: link name, link type
- ENTRY: entry's type, entry data (datum 0, datum 1)

Figure 6 Format of NIS+ Objects
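The concatenation path described for table objects can be sketched as follows. The three table names and their paths are taken from Figure 5; the row contents, the dictionary layout, and the function name `list_entries` are assumptions made for this example only.

```python
# Tables from Figure 5; only the path of foo.bar. is needed for this example.
TABLES = {
    "foo.bar.": {"path": ["fie.bar.", "foe.bar."],
                 "rows": [{"name": "a", "value": "1"}]},
    "fie.bar.": {"path": [],
                 "rows": [{"name": "b", "value": "2"}]},
    "foe.bar.": {"path": [],
                 "rows": [{"name": "b", "value": "3"}, {"name": "c", "value": "4"}]},
}


def list_entries(table_name, criterion, follow_path=True):
    """Search a table, then each table on its path, stopping at the first match.

    Returns (name of the table that matched, matching rows), or (None, []).
    """
    candidates = [table_name]
    if follow_path:
        candidates += TABLES[table_name]["path"]
    for name in candidates:
        hits = [row for row in TABLES[name]["rows"]
                if all(row.get(col) == val for col, val in criterion.items())]
        if hits:
            return name, hits     # the first entries found are the ones returned
    return None, []
```

Searching `foo.bar.` for `name=b` finds nothing in the table itself and returns the entry from `fie.bar.`, the first table on the path, even though `foe.bar.` also holds a matching entry.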

NIS+ Authorization Model

The NIS+ authorization model has four different access rights that can be granted to four classes of NIS+ principal. The four classes of principal are: the owner of an object; the group owner, which is a set of principals that is specified by a group object; the set of authenticated principals that are known to the NIS+ service, called collectively the world; and the set of unauthenticated principals, called collectively the nobody principal.

The four access rights that are grantable are read, modify, create, and destroy. Read conveys the right to read the contents of objects, directory objects, and table objects. Modify conveys the right to change the attributes of an object such as the owner, group owner, time-to-live, and so forth. For directory and table objects, create conveys the right to add objects to the namespace controlled by that object, either a domain or a database, depending on the object's type. Destroy conveys the right to remove objects from directories or tables. Domains and tables have additional access rights masks that allow rights to be granted with a finer degree of granularity.

The NIS+ service uses authentication information extracted from the RPC messages it receives to identify the NIS+ principal making the request. ONC/RPC messages have attached to them authentication information in the form of an authentication flavor. Each flavor of authentication contains a unique principal name. The service uses this name and the authentication type to search an NIS+ table named cred.org_dir.<domain>. In that table, there is an entry that contains the flavor-specific name and the NIS+ principal name. This NIS+ defined name is used in the authorization mechanism rather than the authentication flavor-specific name. This mapping allows the NIS+ service to accept different flavors of authentication information and continue to work correctly.
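A minimal Python sketch of the four classes and four rights follows. The principal names, the `GROUPS` and `HOSTS` example data, and the function names are hypothetical. The sketch also assumes that a request is judged only against the most specific class the principal falls into; the text does not say whether a real server additionally honours rights granted to broader classes.

```python
def principal_class(obj, principal, groups):
    """Classify a requester as owner, group, world, or nobody for one object."""
    if principal is None:                                  # unauthenticated request
        return "nobody"
    if principal == obj["owner"]:
        return "owner"
    if principal in groups.get(obj["group"], set()):
        return "group"
    return "world"                                         # authenticated, otherwise unrelated


def is_allowed(obj, principal, right, groups):
    """Check one right: r(ead), m(odify), c(reate), or d(estroy)."""
    return right in obj["access"][principal_class(obj, principal, groups)]


# Hypothetical group object and object properties used for illustration.
GROUPS = {"admins.widget.com.": {"bob.widget.com."}}
HOSTS = {
    "owner": "alice.widget.com.",
    "group": "admins.widget.com.",
    "access": {"owner": "rmcd", "group": "rm", "world": "r", "nobody": ""},
}
```

Here the owner may destroy, a group member may modify but not destroy, any other authenticated principal may only read, and an unauthenticated request (principal `None`) is refused.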
The authorization model for tables is somewhat more sophisticated than that of the namespace. Access rights in tables can be granted on a per table, per entry, and per column basis. Each level supersedes all subsequent levels. Thus, if read access is allowed for the table, it is allowed for every row and every column in the table. If read access is not allowed for the table but is allowed for a particular row, then it is allowed for all columns in that row. Finally, if read access is not allowed for the table and not allowed for the row but is granted for a particular column, then that column's value is returned. This is shown graphically in Figure 7. Each mask serves to narrow the authorization to a finer grain. In the figure, the entire table is readable only by the owner of the table.

However, all of Row 1 is readable by the group owner of that row. Finally, Column 1 of Row 1 is readable by all NIS+ principals. The modify right may be similarly granted.

[Figure 7: Access Rights for Tables Showing Narrowing of Access Rights. For table foo.bar. (type "foo records", path fie.bar.:foe.bar.), the figure shows the access rights, in the order owner/group/world/nobody, of the table object, of the entry data of a row, and of a column.]

The service enforces read access by censoring the information that the requesting principal is not allowed to read. When one or more, but not all, columns of an entry are protected, the protected columns have the sensitive data replaced by the *NP* string. Entries that are completely protected are simply removed from the list of entries that are returned. If no access to the table is allowed, an error of NIS_PERMISSION is returned.
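The narrowing and censoring rules can be sketched in Python as below. The sketch assumes the rights have already been evaluated for the requesting principal and are stored as booleans; the `PASSWD` data, the `NisPermission` exception, and the function name `censored_read` are hypothetical, and raising an exception when nothing is readable is one possible reading of the NIS_PERMISSION rule.

```python
NP = "*NP*"   # marker that replaces protected column values


class NisPermission(Exception):
    """Stands in for the NIS_PERMISSION error in this sketch."""


def censored_read(table):
    """Return the rows a requester may see, censored column by column."""
    if table["readable"]:                         # table-level read covers everything
        return [list(row["values"]) for row in table["rows"]]
    column_ok = table["column_readable"]
    result = []
    for row in table["rows"]:
        if row["readable"]:                       # row-level read covers all its columns
            result.append(list(row["values"]))
        elif any(column_ok):                      # only some columns are readable
            result.append([value if ok else NP
                           for value, ok in zip(row["values"], column_ok)])
        # rows with nothing readable are dropped from the result
    if not result:
        raise NisPermission("NIS_PERMISSION")
    return result


# Hypothetical two-column table: name is world-readable, the password is not.
PASSWD = {
    "readable": False,
    "column_readable": [True, False],
    "rows": [
        {"readable": False, "values": ["bob", "x9Qz"]},
        {"readable": True, "values": ["mary", "k2Lp"]},
    ],
}
```

For this requester, the row for bob comes back with its second column replaced by `*NP*`, while the row for mary, which grants row-level read, is returned in full.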

NIS+ Replication Model

Like the existing NIS service, NIS+ domains are replicated using a master and slave model. However, the replication is done on a per domain basis, with each internal node in the hierarchy having its own master and replicas. Further, all changes to either a domain or a table within that domain are logged on an event basis, and it is these change events that are propagated to replicas rather than complete databases.

Implementation of NIS+ Service

The components of the NIS+ design which implement these models are shown in Figure 8. The highlighted components are currently part of the SunOS system and are not considered to be part of this design. In Figure 8, each box represents a set of interfaces. Dependencies are not strictly linear, since the location mechanism may be required to look up directory objects within the NIS+ namespace, and this causes it to make use of the NIS+ client library interfaces.

[Figure 8: The Components of the NIS+ Service. Boxes shown: Application / System API; NIS+ Client Library; RPC Authentication Mechanism; NIS+ Location Mechanism; Cache Manager; ONC/RPC; NIS+ Service Daemon; Database Storage/Retrieval Mechanism.]
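The event-based replication described in the NIS+ Replication Model section can be sketched as a master that logs numbered change events and replicas that fetch only the events they have not yet applied. The class names, the sequence-number scheme, the operation names, and the pull-style `sync` call are all assumptions of this sketch; the text does not describe the actual wire protocol.

```python
def apply_event(data, op, key, value):
    """Apply one logged change to a dictionary standing in for a table."""
    if op == "remove":
        data.pop(key, None)
    else:                       # "add" or "modify"
        data[key] = value


class Master:
    def __init__(self):
        self.data, self.log = {}, []

    def update(self, op, key, value=None):
        """Apply a change locally and record it as a numbered event."""
        apply_event(self.data, op, key, value)
        self.log.append((len(self.log) + 1, op, key, value))


class Replica:
    def __init__(self):
        self.data, self.last_seq = {}, 0

    def sync(self, master):
        """Replay only the events logged since the last synchronisation."""
        pending = [event for event in master.log if event[0] > self.last_seq]
        for seq, op, key, value in pending:
            apply_event(self.data, op, key, value)
            self.last_seq = seq
        return len(pending)
```

After two updates on the master, a first `sync` transfers two events; a later removal transfers a single event rather than the whole table, which is the point of propagating change events instead of complete databases.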

NIS+ Client Library

The NIS+ client library interface provides the application interface to the NIS+ service. Generally, applications have their own naming interface that meets their particular requirements. For example, the operating system defines a naming interface gethostbyname(), which takes a host name and returns a structure containing addressing information about that machine. This additional level of indirection allows different naming services to be used in different implementations of the interface. The NIS+ client library is designed to provide a set of capabilities upon which custom naming interfaces may be built.

The two primary operations that are provided are the lookup and list operations. The lookup operation locates objects in the namespace and the list operation searches tables. Both provide the service binding and name resolution functions. Additionally, they provide services specific to their operation.

The lookup routine implements the symbolic link facility. When a lookup request is made and the object returned is a link, the function checks to see if a flag was passed that specified links were to be followed. If it was, the function restarts the lookup using the name stored within the link. Similarly, the list function implements the table search paths. When a search of an NIS+ table returns no entries, and the user has specified that the table's path is to be followed, this function begins searching each table in the path until a match is found or the path is exhausted.

The client library also provides the service location mechanism. All NIS+ clients have a file in their file system that contains the name, address, and authentication information for at least one trusted machine. Requests for service go to that machine or machines first when the client boots up. When the desired name is not within the local domain, the machine serving the domain is queried for a server that does serve the desired domain. This query will be answered with either the name of a server that serves that domain or a server that is closer to that domain. The location function will cache this information with a local daemon named the NIS+ cache manager. If the returned information is not the desired domain, the domain returned is queried for the name of a machine serving the desired domain. In this way, all nodes in the namespace can be located.
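The link-following behaviour of the lookup routine described above can be sketched as follows. The object names in `NAMESPACE`, the `follow_links` flag name, and the `max_hops` guard against link loops are assumptions of this sketch and are not taken from the NIS+ client library.

```python
# A tiny in-memory namespace: one table object and one link that points to it.
NAMESPACE = {
    "hosts.org_dir.widget.com.": {"type": "table"},
    "machines.widget.com.": {"type": "link", "target": "hosts.org_dir.widget.com."},
}


def lookup(name, follow_links=False, max_hops=16):
    """Return the object for a name, restarting the lookup on links if asked to."""
    obj = NAMESPACE[name]
    hops = 0
    while follow_links and obj["type"] == "link":
        hops += 1
        if hops > max_hops:                 # guard added for the sketch only
            raise RuntimeError("too many links")
        obj = NAMESPACE[obj["target"]]      # restart using the name stored in the link
    return obj
```

Without the flag, `lookup("machines.widget.com.")` returns the link object itself; with `follow_links=True` it returns the table object the link names.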

The client library also provides the interfaces to add, remove, or modify NIS+ objects and entries in NIS+ tables. Various routines for manipulating NIS+ names, routines for returning the NIS+ names associated with the current process, and various object handling functions round out the package.

NIS+ Cache Manager

The NIS+ location mechanism is predicated on the fact that a machine that serves a domain is itself a client of a domain above the one it serves. When a location request comes from a client for a domain, the service checks to see if it serves the requested domain. If it does not serve the domain, then it checks to see if the desired domain is below the domain it serves in the namespace. If the desired domain is below its domain, it locates the directory object in its domain that is an ancestor of the desired domain and returns that to the client. If the desired domain is located above the server's domain, it returns the directory object for the domain of which it is itself a client, which is always above the one it serves. Eventually this mechanism will find its way to the root of the namespace and begin to work its way down again.

Once the desired directory object is found, the service returns it to the client. Prior to returning the directory object, it signs it with a cryptographic checksum, using a key that only a root process on the client machine will know. When the client library gets the directory object, it passes it to the cache manager, which verifies the signature and adds it to its shared memory cache of directories. Future client requests will find the directory object in the cache manager's cache and avoid the time-consuming location step.

NIS+ Service Daemon

Once the directory object for a domain is located, the client library will attempt to bind to each of the servers within it until a binding is successful. Update operations will attempt to bind only to the master server. Given a binding, the client library then makes an RPC call to the NIS+ service. The service uses a virtual memory database to store its information. It receives the request, checks the access rights, and then responds with the requested information. The database used is linked to the service at runtime using the shared library facilities of the SunOS system. The use of shared libraries allows the database to be replaced with specialized versions for different applications.
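The location procedure described in the NIS+ Cache Manager section (answer with a directory one step closer to the target, going up to a common ancestor and then down) can be sketched in Python. The function names are hypothetical, domains are represented only by their names, and the sketch assumes that every intermediate domain exists and that unquoted dots separate labels.

```python
def labels(domain):
    """Split a fully qualified domain name into its labels ('.' gives [])."""
    return [label for label in domain.split(".") if label]


def is_below(target, domain):
    """True if target lies strictly below domain in the namespace tree."""
    t, d = labels(target), labels(domain)
    return len(t) > len(d) and t[len(t) - len(d):] == d


def next_hop(served, target):
    """Domain whose directory object a server for `served` would hand back."""
    if target == served:
        return served
    if is_below(target, served):
        t, d = labels(target), labels(served)
        child = t[len(t) - len(d) - 1:]       # ancestor of target one level below served
        return ".".join(child) + "."
    parent = labels(served)[1:]               # otherwise answer with the domain above
    return ".".join(parent) + "." if parent else "."


def locate(start, target):
    """List the domains visited while locating target from start."""
    path = [start]
    while path[-1] != target:
        path.append(next_hop(path[-1], target))
    return path
```

Starting from `eng.widget.com.` and looking for `sales.acme.com.`, the walk climbs to `widget.com.` and `com.`, then descends through `acme.com.` to the target. A cache of directory objects, as kept by the cache manager, would let later requests skip this walk.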

An In-depth Look at an NIS+ Operation

A diagram showing the execution of an NIS+ lookup is provided in Figure 9. An NIS+ operation begins with a call to an application or system library interface (Step 1) that uses NIS+ in its implementation. Most often, this operation will be a search of an NIS+ table. In our example, the gethostbyname function call searches the hosts table. The gethostbyname function then calls the NIS+ client library in Step 2. The client library looks in the cache manager's shared memory location cache for the name and address of an appropriate machine. If there are no appropriate bindings, the client library will locate one using Steps 2.1 and 2.2 to call the nis_finddirectory function in search of a server. When a machine is located, the client library issues an RPC request, Step 3, to talk to the service. The service consults its database and returns the results in Step 4. The client library takes the results and passes them up to the application interface in Step 5, and the gethostbyname call converts the data in the entry object into a struct hostent for the calling program.

[Figure 9: Execution of an NIS+ Lookup. The call hostent *gethostbyname("pepper"); passes through the local library API, the NIS+ client library, the cache library and cache manager, NIS+ find_directory() (which occurs on a cache miss), and the NIS+ service daemons with their structured storage managers. Legend: shared memory interface, shared library interface, remote procedure call, cache message protocol, shared memory mapping.]

Glossary of Terms

Throughout this document, terms are used that have specific meanings. These terms are collected here for easy reference.

API: The Application Programming Interface, or API, consists of a series of C routines that give application programs access to the naming service.
Binding: An NIS+ name, protocol information, and valid credential for the local NIS+ server. This information is sufficient to contact the server and resolve names administered by it.
Credential: A piece of information that identifies the user or host providing it. If the information is encrypted in such a way that only its true owner, as opposed to an imposter, could encrypt it, then it is a Secure Credential.
Datum: The dbm structure that NIS uses: it consists of an opaque key and an opaque data array.
dbm: A collection of C language routines that implement a very simple database. The database is divided into maps.
DNS: The Domain Name Service: the Internet name service that is defined in RFC 1034 and RFC [number missing in the transcription].
Domain: A namespace administered by a single authority. The domain contains one or more servers that have its directory.
Global Namespace: The space of all possible names that NIS+ can name. This space starts at dot (.) and goes down.
Information Base: A generic term for a collection of information that is named.

Internet: A network whose participants are machines and users from more than one organization. Additionally, those organizations do not share any common management at any level.
Leaf Node: A leaf node or name has no other names below it in the namespace tree.
Namespace: A namespace is a collection of names administered by a single authority. This space can be composed of several branches of the Organizational Namespace.
Object: An NIS+ object is the basic unit in the namespace. This data structure consists of an administrative part that identifies ownership, access rights, and so forth, and a data part that contains the value of the object.
Object Value: The data portion of an NIS+ object.
Principal: Clients of the naming service, such as users and hosts, having credentials.


350 SunSoft, Inc Garcia Avenue Mountain View, CA For more information, call Printed in USA 9/ K

Chapter 14 Network authentication mechanism: LDAP

Why this chapter
A forerunner in the field of directories. Now lagging behind Microsoft Active Directory technology. Overall poorly integrated into UNIX/Linux systems.

14.1 The problem

The case of the University of Paris 4:
- a Microsoft Excel database of administrative staff
- a Microsoft Access database of teaching staff
- an "/etc/passwd" database of user accounts
- a MySQL database of both categories of staff
- soon, Oracle-based software
- soon, Microsoft Active Directory

Question 1: send an e-mail to all administrative staff, given that the personnel department will only provide a list of last names and first names. How does the system engineer do it?

Question 2: send an e-mail to all administrative staff except those at the Clignancourt site, given that the personnel department cannot provide a list this time. How does the system engineer do it?

352 14.2 Directory principle
A computer directory is a service that gives organized access to information about people or about various resources.
Goal: maintain scattered islands of data in a consistent, controlled way, and obtain reference data.

A directory is not a relational database. A database (DBMS) is characterized as follows:
  - The data schema is defined 100% to solve one particular problem.
  - Applications know the data schema explicitly.
  - Objects are complex and split across several tables linked by complex relations.
  - A DBMS supports transactions.
  - A DBMS supports a language such as SQL, which allows very complex query and update functions.
  - A DBMS centralizes data to avoid data synchronization problems and poor response times.

353 A directory is characterized as follows:
  - Objects are independent (no dependency links between them).
  - Objects can be distributed across several directories to improve availability.
  - The schema is standardized so that data can be shared.
  - The schema is extensible to cover every need, but extensions are made in a standards-compatible way.
  - Directory applications do not know the internal structure of the data.
  - A directory is mainly read, and is optimized for that.

14.3 LDAP directory
LDAP: Lightweight Directory Access Protocol
Heir to the ISO X.500 directory. Currently at version 3. RFC 2251 to 2256, RFC 2829 to 2830, RFC
There is no standard for representing access controls on the data.
«LDAP» is at once:
  - the name of a protocol
  - the name of a data structure
  - the name of server implementations that follow the protocol
Confusion is possible...

354 14.4 LDAP data model: DIT, suffix
Entries are organized as a tree, the DIT (Directory Information Tree).
One of the difficulties of LDAP is designing the layout of the DIT. What should it reflect?
  - an organizational DIT?
  - a geographical DIT?
There is no universal solution.

An organizational DIT?
    dc=company,dc=com
        dc=recherche
            dc=people
            dc=groups
        dc=finance
            dc=people
            dc=groups
        dc=marketing
            dc=people
            dc=groups

355 A geographical DIT?
    dc=company,dc=com
        dc=america
            dc=people
            dc=groups
        dc=europe
            dc=people
            dc=groups
        dc=asia
            dc=people
            dc=groups

The root of the tree is purely conceptual and does not really exist. It is the suffix that determines the absolute addresses of objects (like «/» for the UNIX file tree).
    dc=company,dc=com          <-- SUFFIX
        dc=recherche
            dc=people
            dc=groups
        dc=finance
            dc=people
            dc=groups
        dc=marketing
            dc=people
            dc=groups

356 The suffix can take several forms:
  - form 1: «o=company.com»
  - form 2: «o=company,c=com»
  - form 3: «dc=company,dc=com»
The third form is preferred. Explanation: given in RFC
DNS domain names are guaranteed to be unique. Since directories can be distributed, building the suffix on this uniqueness of DNS domain names guarantees the uniqueness of LDAP naming contexts.

Example of a DIT viewed with LdapBrowser, available at the URL:
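The third suffix form follows mechanically from a DNS domain name. A minimal sketch of that mapping, assuming a plain dotted domain name; the function name `domain_to_suffix` is hypothetical and only serves the illustration:

```python
def domain_to_suffix(domain):
    """Map a DNS domain name to a dc-style LDAP suffix (form 3 above)."""
    # Drop a possible trailing dot and ignore empty labels.
    labels = [label for label in domain.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    # One dc= component per DNS label, most specific label first.
    return ",".join("dc=" + label.lower() for label in labels)


print(domain_to_suffix("company.com"))
```

Because each DNS domain has a single owner, two organizations that derive their suffix this way cannot collide.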

357 14.5 LDAP data model: entry, attributes, DN, URL
DSE: Directory Service Entry
The entries in the DIT (DSEs) are aggregates of single-valued or multi-valued attributes that can store any data format (first name, telephone number, image, sound, etc.).
DSEs are stored in the DIT and arranged according to their unique identifier, the DN (Distinguished Name).
A DN is the concatenation of an RDN (Relative DN) and the DN of the parents. A DN is comparable to a primary key.
    suffix: dc=company,dc=com
    RDN:    ou=recherche
    DN:     ou=recherche,dc=company,dc=com
    RDN:    uid=besancon
    DN:     uid=besancon,ou=recherche,dc=company,dc=com
(the RDN must be one of the attribute/value pairs of the DSE)
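The rule "DN = RDN + DN of the parent" can be sketched as follows. This is a deliberately naive illustration: the helper names are hypothetical, and the splitting ignores escaped commas inside attribute values, which real DNs may contain.

```python
def make_dn(rdn, parent_dn):
    """Build a DN by prefixing the parent's DN with the entry's RDN."""
    return rdn + "," + parent_dn


def split_dn(dn):
    """Split a DN into (RDN, parent DN); naive, no support for escaped commas."""
    rdn, _, parent = dn.partition(",")
    return rdn, parent


suffix = "dc=company,dc=com"
ou_dn = make_dn("ou=recherche", suffix)
user_dn = make_dn("uid=besancon", ou_dn)
print(user_dn)
print(split_dn(user_dn))
```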

358 LDAP URLs exist (RFC 2255) and take the form:
    ldap://serveur:389/dn
For example in Netscape Communicator:

14.6 LDAP data model: schema, syntaxes, OID, objectclass
The schema of the DIT gathers the definitions of the types of objects that the directory can contain or that can be searched for.
The schema covers the objects that instantiate LDAP classes, the definitions of those classes and of their attributes, and the syntaxes of those attributes.
All these elements are identified by Object Identifiers, called OIDs.
    attributetype ( NAME uidnumber
        DESC An integer uniquely identifying a user in a domain
        EQUALITY integermatch
        SYNTAX SINGLE-VALUE )
    objectclass ( NAME posixaccount SUP top AUXILIARY
        DESC Abstraction of an account with POSIX attributes
        MUST ( cn $ uid $ uidnumber $ gidnumber $ homedirectory )
        MAY ( userpassword $ loginshell $ gecos $ description ) )
(the numeric identifiers that open each definition, not reproduced here, are OIDs)

359 A syntax is a model for representing the values of an attribute: for example boolean, integer, binary (for an image or a sound), etc.
The «objectclass» attribute gives the list of classes that a DSE instantiates. Each class shapes the structure of the DSE by specifying a list of mandatory attributes («MUST» in the objectclass) and a list of optional attributes («MAY» in the objectclass).

Example:
    objectclass ( NAME inetorgperson
        DESC RFC2798: Internet Organizational Person
        SUP organizationalperson STRUCTURAL
        MAY ( audio $ businesscategory $ carlicense $ departmentnumber $ displayname $
              employeenumber $ employeetype $ givenname $ homephone $ homepostaladdress $
              initials $ jpegphoto $ labeleduri $ mail $ manager $ mobile $ o $ pager $
              photo $ roomnumber $ secretary $ uid $ usercertificate $
              x500uniqueidentifier $ preferredlanguage $ usersmimecertificate $ userpkcs12 ) )
    objectclass ( NAME posixaccount SUP top AUXILIARY
        DESC Abstraction of an account with POSIX attributes
        MUST ( cn $ uid $ uidnumber $ gidnumber $ homedirectory )
        MAY ( userpassword $ loginshell $ gecos $ description ) )
For a DSE that instantiates both classes, the attribute «uid» (MAY in «inetorgperson», MUST in «posixaccount») is mandatory.
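The way MUST and MAY lists combine across the classes of one entry can be sketched as below. This is an illustration under stated assumptions, not how an LDAP server is implemented: the `OBJECT_CLASSES` table copies the «posixaccount» definition above, keeps only a few of the «inetorgperson» MAY attributes, and ignores inheritance through SUP; `check_entry` is a hypothetical helper.

```python
# Simplified schema table: MUST and MAY attribute sets per objectclass.
OBJECT_CLASSES = {
    "posixaccount": {
        "must": {"cn", "uid", "uidnumber", "gidnumber", "homedirectory"},
        "may": {"userpassword", "loginshell", "gecos", "description"},
    },
    "inetorgperson": {  # subset of the MAY list, inheritance ignored
        "must": set(),
        "may": {"uid", "mail", "givenname", "displayname", "mobile"},
    },
}


def check_entry(entry):
    """Return a list of schema problems for an entry given as {attribute: [values]}."""
    problems = []
    must, allowed = set(), {"objectclass"}
    for name in entry.get("objectclass", []):
        definition = OBJECT_CLASSES.get(name.lower())
        if definition is None:
            problems.append("unknown objectclass: " + name)
            continue
        # The union of all classes decides what is mandatory and what is allowed.
        must |= definition["must"]
        allowed |= definition["must"] | definition["may"]
    present = {attribute.lower() for attribute in entry}
    problems += ["missing MUST attribute: " + a for a in sorted(must - present)]
    problems += ["attribute not allowed: " + a for a in sorted(present - allowed)]
    return problems


entry = {
    "objectclass": ["inetorgperson", "posixaccount"],
    "cn": ["Thierry Besancon"],
    "mail": ["besancon@example.org"],
    "uidnumber": ["1000"],
    "gidnumber": ["1000"],
    "homedirectory": ["/home/besancon"],
}
print(check_entry(entry))
```

The sample entry omits «uid» on purpose: with «inetorgperson» alone that would be legal, but adding «posixaccount» makes it mandatory.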

360 LDAP objectclasses form a hierarchy whose root is the objectclass «top». Each class inherits from a single parent class. Each class can have several subclasses.
    (Abstract)    top
    (Structural)  person
    (Auxiliary)   companyperson
    (Structural)  organizationalperson
    (Structural)  residentialperson

14.7 LDAP protocol / Bind
At the network level:
  - LDAP: TCP port 389
  - LDAP + SSL: TCP port 636
  - LDAP messages are described in ASN.1 and encoded with BER
An LDAP dialogue starts after a session-opening phase called the bind. The bind can be anonymous or authenticated.

361 14.8 LDIF data format
Problem: how are LDAP objects handled in practice?
Answer: by handling them in the LDAP Data Interchange Format, called LDIF.
LDIF is not part of the LDAP protocol (the protocol RFCs do not mention it; the format is specified separately, in RFC 2849). LDIF is understood only by the utilities that convert it into LDAP protocol operations.

Beware of non-ASCII characters:
  - if the value of an attribute consists only of ASCII characters, write «attribute: value»
  - if the value of an attribute contains non-ASCII characters, encode the value in UTF-8, then encode the result in BASE64, and finally write «attribute:: encoded-value»
For example, the attribute «description» with the value "Université de Paris-Sorbonne, Paris 4" is not written in LDIF as
    description: Université de Paris-Sorbonne, Paris 4
but as
    description:: VW5pdmVyc2l0w6kgZGUgUGFyaXMtU29yYm9ubmUsIFBhcmlzIDQ=
Note the differences: the double colon and the encoded value.
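The encoding rule above is easy to reproduce with the Python standard library. A minimal sketch, assuming only the ASCII / non-ASCII distinction made in the text; the function name `ldif_line` is hypothetical, and the sketch ignores the other cases where LDIF requires BASE64 (for example a value starting with a space or a colon) as well as line folding.

```python
import base64


def ldif_line(attribute, value):
    """Format one attribute as an LDIF line, using BASE64 for non-ASCII values."""
    if value.isascii():
        # Pure ASCII: single colon, value written as-is.
        return attribute + ": " + value
    # Non-ASCII: UTF-8 first, then BASE64, and a double colon.
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return attribute + ":: " + encoded


print(ldif_line("l", "Paris, France"))
print(ldif_line("description", "Université de Paris-Sorbonne, Paris 4"))
```

The second call reproduces the encoded «description» line shown above.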

362 2 handy utilities: to be checked...

Example of a DSE with accented characters, not yet encoded for LDIF:
    dn: ou=personnel,dc=paris4,dc=sorbonne,dc=fr
    objectclass: top
    objectclass: organizationalunit
    ou: Personnels de l'Université de Paris-Sorbonne, Paris 4
    businesscategory: academic research
    telephonenumber: +33 (0)
    facsimiletelephonenumber: +33 (0)
    postofficebox: Université de Paris-Sorbonne, Paris 4
    postalcode: F
    postaladdress: 1 rue Victor Cousin
    l: Paris, France
    description: Université de Paris-Sorbonne, Paris 4

363 Example of a DSE in LDIF format:
    dn: ou=personnel,dc=paris4,dc=sorbonne,dc=fr
    objectclass: top
    objectclass: organizationalunit
    ou:: UGVyc29ubmVscyBkZSBsJ1VuaXZlcnNpdMOpIGRlIFBhcmlzLVNvcmJvbm5lLCBQYXJpcyA0
    businesscategory: academic research
    telephonenumber: +33 (0)
    facsimiletelephonenumber: +33 (0)
    postofficebox:: VW5pdmVyc2l0w6kgZGUgUGFyaXMtU29yYm9ubmUsIFBhcmlzIDQ=
    postalcode: F
    postaladdress: 1 rue Victor Cousin
    l: Paris, France
    description:: VW5pdmVyc2l0w6kgZGUgUGFyaXMtU29yYm9ubmUsIFBhcmlzIDQ=

14.9 Implementations
Several implementations of LDAP exist:
  - OpenLDAP, version (as of 21 August 2002)
  - SUN ONE (formerly Netscape Directory Server, bought by SUN, which became Sun iPlanet Directory and then SUN ONE), included as standard in Solaris 8 and later
  - Novell Directory Services, version 4?
  - other commercial directories...
The various implementations comply with the protocol standards. They differ, however, in everything that is not standardized. In particular, access rights on the data are encoded in incompatible ways.

364 OpenLDAP
Cf
Versions 2.x.y of OpenLDAP comply with the LDAP v3 standards. The software consists of:
  - the LDAP server «slapd»
  - the synchronization server «slurpd»
  - utilities («slapadd», «ldapsearch», «ldapadd», «ldapdelete», «ldapmodify», «ldappasswd», etc.)
  - LDAP libraries and include files
  - a configuration file «slapd.conf», in which the suffix, the rootdn and the rootdn password are defined

The replication mechanism between OpenLDAP servers works as follows:
    1) the client sends a modification request to the slave «slapd»
    2) the slave answers with a referral
    3) the client sends the modification request to the master «slapd»
    4) the master answers (OK / not OK)
    5) the master records the change in the modification log
    6) «slurpd» reads the modification log
    7) «slurpd» propagates the change to the slave «slapd»

365 ObjectClass posixaccount, shadowaccount
Cf RFC2307
Cf the schema «nis.schema» in OpenLDAP.
The objectclass «posixaccount» implements the equivalent of the C structure of «<pwd.h>»:
    objectclass ( NAME posixaccount SUP top AUXILIARY
        DESC Abstraction of an account with POSIX attributes
        MUST ( cn $ uid $ uidnumber $ gidnumber $ homedirectory )
        MAY ( userpassword $ loginshell $ gecos $ description ) )
The objectclass «shadowaccount» implements the principle of shadow passwords:
    objectclass ( NAME shadowaccount SUP top AUXILIARY
        DESC Additional attributes for shadow passwords
        MUST uid
        MAY ( userpassword $ shadowlastchange $ shadowmin $ shadowmax $ shadowwarning $
              shadowinactive $ shadowexpire $ shadowflag $ description ) )

A little bibliography

366 Chapter 15 Selecting naming services, /etc/nsswitch.conf

Why this chapter
Several authentication sources have been presented. How should all of them be organized now?

15.1 The problem
Example:
  - there are the system files («/etc/passwd», «/etc/hosts», «/etc/services», ...)
  - there is the DNS
  - there is NIS
  - there is NIS+
  - there is LDAP
  - etc.
How do we choose which services will answer name lookup requests?
One solution: state which naming services are used, and in which order, in the file «/etc/nsswitch.conf» (naming service switch).

367 15.2 Syntax of /etc/nsswitch.conf
The file has the following format:
    service: source [ status=action status=action... ] source...
where:
  - source is one of the keywords «files», «dns», «ldap», «nis», «nisplus», «xfn» (list to be checked on each UNIX system, which offers more or fewer of these services)
  - status is one of the following keywords:
      «SUCCESS»: the requested entry was found
      «NOTFOUND»: the requested entry was not found
      «UNAVAIL»: the source is not configured on this system, or it is failing
      «TRYAGAIN»: the source is busy and cannot answer now, perhaps later
  - action is one of the keywords:
      «return»: return the value found, or the absence of a value
      «continue»: try the next source
      «forever» (only for «TRYAGAIN»): keep trying this source until it answers
By default, each source has:
    [SUCCESS=return NOTFOUND=continue UNAVAIL=continue TRYAGAIN=forever]
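The walk through the sources can be sketched as a small simulation. This is a toy model of the semantics described above, not the resolver's actual code: `parse_line`, `lookup` and the `backends` table (one function per source, returning a status and a value) are hypothetical names, and «forever» is bounded by `max_retries` so that the sketch always terminates.

```python
DEFAULT_ACTIONS = {
    "SUCCESS": "return",
    "NOTFOUND": "continue",
    "UNAVAIL": "continue",
    "TRYAGAIN": "forever",
}


def parse_line(line):
    """Parse 'service: source [STATUS=action ...] source ...' into (service, entries)."""
    service, _, rest = line.partition(":")
    entries = []
    in_brackets = False
    for token in rest.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            in_brackets = True
        elif token == "]":
            in_brackets = False
        elif in_brackets:
            # A bracketed criterion overrides the default for the preceding source.
            status, _, action = token.partition("=")
            entries[-1][1][status.upper()] = action.lower()
        else:
            entries.append((token, dict(DEFAULT_ACTIONS)))
    return service.strip(), entries


def lookup(entries, key, backends, max_retries=3):
    """Try each source in order; backends maps a source name to key -> (status, value)."""
    for source, actions in entries:
        backend = backends.get(source)
        for _ in range(max_retries):
            status, value = backend(key) if backend else ("UNAVAIL", None)
            action = actions[status]
            if action != "forever":
                break
        else:
            action = "continue"  # the sketch gives up on a source that stays busy
        if action == "return":
            return value if status == "SUCCESS" else None
    return None


service, entries = parse_line("hosts: ldap [NOTFOUND=return] files")
backends = {
    "ldap": lambda key: ("NOTFOUND", None),
    "files": lambda key: ("SUCCESS", "192.0.2.1"),
}
print(service, lookup(entries, "cerise", backends))
```

With «[NOTFOUND=return]» after «ldap», a negative answer from LDAP is final and «files» is consulted only when LDAP is unavailable.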

368 15.3 Example of /etc/nsswitch.conf
(taken from SOLARIS)
    passwd:     files ldap
    group:      files ldap
    hosts:      ldap [NOTFOUND=return] files
    ipnodes:    files
    networks:   ldap [NOTFOUND=return] files
    protocols:  ldap [NOTFOUND=return] files
    rpc:        ldap [NOTFOUND=return] files
    ethers:     ldap [NOTFOUND=return] files
    netmasks:   ldap [NOTFOUND=return] files
    bootparams: ldap [NOTFOUND=return] files
    publickey:  ldap [NOTFOUND=return] files
    netgroup:   ldap
    automount:  files ldap
    aliases:    files ldap
    ...

15.4 About LDAP
For systems that do not include LDAP natively in the OS, refer to:

369 Chapter 16 Pluggable Authentication Module (PAM)

Why this chapter

16.1 The problem
Example: take a machine in a university that hosts the accounts of 10 professors and 1000 students. The machine has a modem. Professors are allowed to connect to the machine by modem, students are not. The machine is an NIS client.
When someone connects to a machine by modem, the system runs the «login» command once the connection is established.
How can this be implemented? By modifying the «login» program to fit this very particular case???

370 The problem in general: how can the authentication method of a program (for example FTP) be changed without reprogramming everything?
Solution, originally developed by SUN, then adopted and encouraged in Linux: Pluggable Authentication Module, called PAM.

16.2 Principle of PAM
Through PAM, authentication calls external modules of authentication code chosen according to the service. Authentication is moved out of the program.
[Figure: «programme1» and «programme2» each call pam_init() and pam_auth(); «/etc/pam.conf» associates each program with its own stack of modules taken from «modulea.so» to «modulef.so».]

371 4 categories of PAM modules:
  - authentication module: functions to authenticate a user and set up their credentials
  - account management module: functions to determine whether the user has a valid account (passwords may expire, which is called password aging, and access may be restricted to certain hours)
  - session management module: functions to set up and terminate user sessions
  - password management module: functions to change a user's password and certain characteristics of the account
For a given application, the required modules are arranged as a stack, and each module in the stack is tried in turn to build the requested authentication.
Depending on the configuration, a user may have to enter several passwords.

16.3 Configuring PAM: /etc/pam.conf, /etc/pam.d
«/etc/pam.conf» defines which modules are used for each application.
On Linux there is also the directory «/etc/pam.d», which contains one file per application, named after the application. For example «/etc/pam.d/login» for the «login» service.
A line of «/etc/pam.conf» contains 5 fields:
    service_name module_type control_flag module_path options

372     service_name module_type control_flag module_path options
  - «service_name» names the service the line applies to («other» is the wildcard service)
  - «module_type» is one of the 4 keywords «auth», «account», «session», «password»
  - «control_flag» is one of the 4 keywords «requisite», «required», «optional», «sufficient»
  - «module_path» is the path of the module
  - the options depend on the module

For example, the «login» service calls the following modules:
    # Authentication management
    login   auth     required   /usr/lib/security/pam_unix.so.1
    login   auth     required   /usr/lib/security/pam_dial_auth.so.1
    # Account management
    login   account  requisite  /usr/lib/security/pam_roles.so.1
    login   account  required   /usr/lib/security/pam_projects.so.1
    login   account  required   /usr/lib/security/pam_unix.so.1
    # Session management
    other   session  required   /usr/lib/security/pam_unix.so.1
    # Password management
    other   password required   /usr/lib/security/pam_unix.so.1
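The five-field layout lends itself to a very small parser. A sketch, assuming whitespace-separated fields and «#» comments as in the extract above; `PamEntry` and `parse_pam_conf` are hypothetical names and not part of any PAM library.

```python
from collections import namedtuple

PamEntry = namedtuple("PamEntry", "service module_type control_flag module_path options")

MODULE_TYPES = {"auth", "account", "session", "password"}
CONTROL_FLAGS = {"requisite", "required", "optional", "sufficient"}


def parse_pam_conf(text):
    """Parse pam.conf-style text into a list of PamEntry tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("expected at least 4 fields: " + raw)
        service, module_type, control_flag, module_path = fields[:4]
        if module_type not in MODULE_TYPES:
            raise ValueError("unknown module_type: " + module_type)
        if control_flag not in CONTROL_FLAGS:
            raise ValueError("unknown control_flag: " + control_flag)
        # Everything after the module path is passed to the module as options.
        entries.append(PamEntry(service, module_type, control_flag, module_path, fields[4:]))
    return entries


SAMPLE = """
# Authentication management
login auth sufficient /usr/lib/security/pam_unix.so.1
login auth required /usr/lib/security/pam_ldap.so.1 use_first_pass
"""
for entry in parse_pam_conf(SAMPLE):
    print(entry)
```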

373 16.4 Module control directives
The possible directives that control how modules are tried are:
  - directive «required»
  - directive «requisite»
  - directive «optional»
  - directive «sufficient»

Directive «required»
The return value of this module must be «PAM_SUCCESS» for the authentication stack to end in success. With «PAM_AUTH_ERR» the failure is recorded, the remaining modules are still tried, and the stack ends in failure; the application (for example «login») then starts the whole stack again.
[Figure: stack of PAM modules, REQUIRED module, SUCCESS and AUTH_ERR outcomes]

374 Directive «requisite»
A return value of «PAM_AUTH_ERR» makes the authentication stack exit early, in failure.
[Figure: stack of PAM modules, REQUISITE module, SUCCESS and AUTH_ERR outcomes]

Directive «optional»
If this module fails, the stack still exits with success provided another module in the stack succeeds.
[Figure: stack of PAM modules, OPTIONAL module, SUCCESS and AUTH_ERR outcomes]

375 Directive «sufficient»
A return value of «PAM_SUCCESS» from this module makes the authentication stack exit early, with success; the other modules in the stack are not taken into account.
[Figure: stack of PAM modules, SUFFICIENT module, AUTH_ERR and SUCCESS outcomes]

16.5 Modules, /usr/lib/security
By convention, modules are stored in «/usr/lib/security/».
For example on Solaris:
    % ls /usr/lib/security
    amiserv               pam_ldap.so.1          pam_sample.so.1
    pam_ami.so            pam_projects.so        pam_smartcard.so
    pam_ami.so.1          pam_projects.so.1      pam_smartcard.so.1
    pam_dial_auth.so      pam_rhosts_auth.so     pam_unix.so
    pam_dial_auth.so.1    pam_rhosts_auth.so.1   pam_unix.so.1
    pam_krb5.so           pam_roles.so           sparcv9
    pam_krb5.so.1         pam_roles.so.1
    pam_ldap.so           pam_sample.so
Each module provides the implementation of one specific mechanism.
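The four directives of section 16.4 can be compared with a toy evaluator. This is a simplified model written for illustration, not the PAM library's algorithm: each module is reduced to a (control flag, succeeded) pair, `run_stack` is a hypothetical name, and a stack in which no module succeeds is assumed to fail.

```python
def run_stack(stack):
    """Evaluate a list of (control_flag, succeeded) pairs; True means authenticated."""
    failed_required = False  # a «required» module has failed earlier in the stack
    any_success = False
    for control_flag, succeeded in stack:
        if control_flag == "requisite":
            if not succeeded:
                return False  # early exit, in failure
            any_success = True
        elif control_flag == "required":
            if succeeded:
                any_success = True
            else:
                failed_required = True  # remembered, but the stack keeps going
        elif control_flag == "sufficient":
            if succeeded and not failed_required:
                return True  # early exit, with success
            # a failing «sufficient» module is simply ignored
        elif control_flag == "optional":
            if succeeded:
                any_success = True
        else:
            raise ValueError("unknown control flag: " + control_flag)
    return any_success and not failed_required


# Two «required» modules, the first one failing: both run, the stack fails.
print(run_stack([("required", False), ("required", True)]))
# «sufficient» UNIX password accepted: the «required» LDAP module is never consulted.
print(run_stack([("sufficient", True), ("required", False)]))
```

The first call mirrors a stack where a second password is still requested after a wrong first one; the second shows why a «sufficient» module placed first can short-circuit the rest of the stack.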

376 «/usr/lib/security/pam_unix.so.1» provides support for authentication, account management, session management and password management. It uses UNIX passwords for authentication.
«/usr/lib/security/pam_dial_auth.so.1» can only be used for authentication. It uses data stored in «/etc/dialups» and «/etc/d_passwd». Mainly used by «login».
«/usr/lib/security/pam_rhosts_auth.so.1» can only be used for authentication. It uses the data stored in the files «.rhosts» and «/etc/hosts.equiv». Mainly used by «rlogin» and «rsh».

16.6 Module options
Options can be passed to the modules:
  - options specific to each module; see the documentation of each module; for example «retry=3» or «debug»
  - option «use_first_pass»: use exclusively the password entered for the first module of the service's stack
  - option «try_first_pass»: first use the password entered for the first module of the service's stack and, if that password fails, ask for another one
(Support for the options «use_first_pass» and «try_first_pass» is strongly recommended to PAM module developers, so it has to be checked for each module.)

377 16.7 Example 1
Extract from «/etc/pam.conf»:
    # Authentication management
    login   auth   required   /usr/lib/security/pam_unix.so.1
    login   auth   required   /usr/lib/security/pam_dial_auth.so.1
File «/etc/dialups»:
    /dev/pts/9
File «/etc/d_passwd»:
    /bin/bash:nuemrw70uy9m.:

Interactive session:
    % tty
    /dev/pts/9
    % exec login
    login: besancon
    Password: XXXXXXXX            <-- password
    Dialup Password: YYYYYYYY     <-- password
    %%                            <-- connection established, shell started
The extra line «Dialup Password:» is clearly visible.

378 If one of the passwords is wrong, all the password prompts are tried again:
    % exec login
    login: besancon
    Password: ZZZZZZZZ            <-- wrong password
    Dialup Password: YYYYYYYY     <-- password OK
    Login incorrect
    login: besancon
    Password: XXXXXXXX            <-- password OK
    Dialup Password: ZZZZZZZZ     <-- wrong password
    Login incorrect
    login: besancon
    Password: XXXXXXXX            <-- password OK
    Dialup Password: YYYYYYYY     <-- password OK
    %%                            <-- connection established, shell started
This leaves a few traces in SYSLOG:
    Aug 20 14:51:14 cerise login: [ID auth.debug] pam_authenticate: error Authentication failed
    ...
    Aug 20 14:51:34 cerise login: [ID auth.debug] pam_authenticate: error Authentication failed

16.8 Example 2
To allow authentication through LDAP, put the following in «/etc/pam.conf»:
    # Authentication management
    login   auth   sufficient   /usr/lib/security/pam_unix.so.1
    login   auth   required     /usr/lib/security/pam_ldap.so.1 use_first_pass

379 16.9 About LDAP
For systems that do not include LDAP natively in the OS, refer to:

A little bibliography

380 Chapter 17 System monitoring

Why this chapter
SYSLOG is an important tool, but it is primitive and does not make it possible to monitor a network: roughly speaking, SYSLOG sends a message when something malfunctions, and not before. A network is not monitored with SYSLOG.
Monitoring requires various tools to:
  - draw diagrams
  - plot curves from static data
  - plot curves from dynamic data
  - watch those curves built from dynamic data
  - watch the components (hardware, software) of the systems, escalate alerts, etc.

17.1 Network drawing software
Several programs are available, but they are incompatible with one another:
  - VISIO, available in Microsoft Office on Windows
  - DIA, available on UNIX or Windows

381 Library of CISCO symbols in DIA:

17.2 Curve plotting software
Goal: visualize certain aspects of the system:
  - paging
  - network throughput
  - use of software license tokens
  - machine room temperatures
  - etc.
Which tool(s) should be used to plot the measurements?
  - GNUPLOT
  - MRTG
  - RRDTOOL
  - etc.

382 17.3 GNUPLOT
Also available on Windows.
Example (plot of a value as a function of time):
[Figure: plot of "data" using 1:2, y axis from 0 to 500, dates on the x axis]

Gnuplot uses a scriptable language to draw curves.
Sample data:
Script:
    set terminal png
    set output "data.png"
    set timefmt "%Y%m%d"
    set xdata time
    set format x "%Y\n%m/%d"
    set yrange [0:500]
    plot "data" using 1:2 with linespoints

383 Image «data.png» obtained with «gnuplot consommation.gnuplot»:
[Figure: plot of "data" using 1:2, y axis from 0 to 500, dates on the x axis]

17.4 MRTG (Multi Router Traffic Grapher)

384 17.5 RRDTOOL

385
Principle
- The data file has the extension ".rrd". RRD stands for Round Robin Database.
- The file holds N slots.
- Once the N slots are full, the oldest value is dropped and the others shift to make room for the new one.
[figure: a row of slots labelled DONNEES 27 down to DONNEES 22, shown twice]

Data can be "consolidated". For example:
- sampling every 5 minutes
- consolidation: 1 value per hour
- consolidation: 1 value per day
- consolidation: 1 value per week
Purpose of consolidation: produce graphs over different time spans without storing every sample, so that the data file stays a reasonable size:
- graph of the last N hours
- graph of the last N days
- graph of the last N weeks
- graph of the last N months
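The round robin and consolidation ideas can be sketched in a few lines of Python. This is only an illustration of the principle, not RRDTOOL's file format or API. The class name and the slot counts are hypothetical, and averaging is just one possible way to consolidate, assumed here for the example.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: once full, the oldest value is discarded."""

    def __init__(self, slots, samples_per_point=1):
        # A deque with maxlen drops its oldest entry when a new one arrives.
        self.values = deque(maxlen=slots)
        self.samples_per_point = samples_per_point
        self._pending = []

    def update(self, sample):
        """Feed one raw sample; store an average once enough samples arrived."""
        self._pending.append(sample)
        if len(self._pending) == self.samples_per_point:
            self.values.append(sum(self._pending) / len(self._pending))
            self._pending = []


if __name__ == "__main__":
    # Hypothetical sizing: 5-minute samples, so 12 samples make one hourly point.
    raw = RoundRobinArchive(slots=6)
    hourly = RoundRobinArchive(slots=4, samples_per_point=12)
    for sample in range(1, 25):  # 24 samples, i.e. 2 hours of data
        raw.update(sample)
        hourly.update(sample)
    print(list(raw.values))     # the 6 most recent raw samples
    print(list(hourly.values))  # one averaged value per hour
```

Both archives have a fixed size, so the storage needed never grows, whatever the length of the monitoring period.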

386
17.6 Monitoring software
Several options:
- in-house products
- turnkey software
- frameworks that allow in-house customization
- etc.
In the last category: NAGIOS, CACTI, HP OpenView, BMC Patrol, etc.

17.7 NAGIOS
Principle: the NAGIOS "console" monitors clients through NAGIOS "software agents".
[figure: diagram with the labels Console, indirect Console, host 1, host 2]

387
The NAGIOS console can have a web interface (APACHE web server + CGI scripts), but this is optional. The web interface is simple.
Most of the work lies in:
- describing the network:
  - the list of machine definitions
  - the list of definitions of the services to monitor on those machines
- installing the NAGIOS agents

NAGIOS is becoming well known and is being integrated into commercial products.

388
Examples


390
NAGIOS objects
- host
- hostgroup
- service
- servicegroup
- contact
- contactgroup
- timeperiod
- command
- servicedependency (dependency between services)
- serviceescalation
- hostdependency (dependency between machines)
- hostescalation
- hostextinfo (extended host information: icon, URL)
- serviceextinfo (extended service information: icon, URL)
Templates can be used to simplify object definitions by grouping common characteristics in one place.

Defining a template
Case of a host object:

    define host{
        name                          generic-host  ; Name of template
        notifications_enabled         1             ; Host notifications are enabled
        event_handler_enabled         1             ; Host event handler is enabled
        flap_detection_enabled        1             ; Flap detection is enabled
        failure_prediction_enabled    1             ; Failure prediction is enabled
        process_perf_data             1             ; Process performance data
        retain_status_information     1             ; Retain status information across program restarts
        retain_nonstatus_information  1             ; Retain non-status information across program restarts
        notification_period           24x7          ; Send host notifications at any time
        register                      0             ; DON'T REGISTER; IT IS JUST A TEMPLATE
    }

391
Templates can be nested:

    define host {
        name                   server            ; Name of template
        use                    generic-host      ; Inherits from generic-host
        check_period           24x7              ; By default, Linux hosts are checked round the clock
        max_check_attempts     10                ; Check each Linux host 10 times (max)
        check_command          check-host-alive  ; Default command to check Linux hosts
        notification_period    workhours
        notification_interval  120               ; Resend notification every 2 hours
        notification_options   d,u,r             ; Only send notifications for specific host states
        contact_groups         admins            ; Notifications get sent to the admins by default
        register               0                 ; DON'T REGISTER; IT IS JUST A TEMPLATE
    }

Defining a host (machine)

    define host{
        use        server
        host_name  mail.example.com
        alias      smtp.example.com
        address    mail.example.com
    }

States of a host:
- "OK"
- "DOWN"
- "UNREACHABLE"
- "RECOVERING"
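The effect of chained `use` directives can be pictured with a small Python sketch. It is a simplified model, assuming a single parent per object and directives stored as strings; it is not how NAGIOS itself is implemented. The `resolve` function and the `OBJECTS` table are hypothetical, with values borrowed from the definitions above.

```python
# Directives that describe the template itself and are not passed on to children.
NOT_INHERITED = {"name", "register", "use"}

OBJECTS = {
    "generic-host": {
        "name": "generic-host",
        "register": "0",
        "notifications_enabled": "1",
        "notification_period": "24x7",
    },
    "server": {
        "name": "server",
        "use": "generic-host",
        "register": "0",
        "max_check_attempts": "10",
        "notification_period": "workhours",
    },
    "mail.example.com": {
        "use": "server",
        "host_name": "mail.example.com",
        "address": "mail.example.com",
    },
}


def resolve(name, objects):
    """Merge an object with the templates it names through `use`.

    Directives set on the object itself win over inherited ones.
    """
    obj = objects[name]
    merged = {}
    parent = obj.get("use")
    if parent is not None:
        inherited = resolve(parent, objects)
        merged.update(
            {key: value for key, value in inherited.items() if key not in NOT_INHERITED}
        )
    merged.update(obj)
    return merged


if __name__ == "__main__":
    for key, value in sorted(resolve("mail.example.com", OBJECTS).items()):
        print(key, value)
```

In this model the host ends up with `notification_period workhours`, because the value set in `server` overrides the `24x7` inherited from `generic-host`.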

392
Defining a service

    define service {
        use                  local-service
        host_name            mail.example.com
        service_description  SSH
        check_command        check_ssh
    }
    define service {
        use                  local-service
        host_name            mail.example.com
        service_description  CAMCONTROL
        check_command        check_nrpe!check_camcontrol_da0
    }

States of a service:
- "OK"
- "WARNING"
- "CRITICAL"
- "UNKNOWN"
- "RECOVERING"

Defining a command
A NAGIOS service calls a NAGIOS command.
NAGIOS command = NAGIOS plugin = NAGIOS agent
Many commands are provided.

    define command {
        command_name  check_ssh
        command_line  $USER1$/check_ssh -H $HOSTADDRESS$
    }
    define command {
        command_name  check_nrpe
        command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }

It is easy to add your own commands:

    define command {
        command_name  check-netapp-globalstatusmsg
        command_line  $USER1$/check_snmp -H $HOSTADDRESS$ -C public -o
    }
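To see how a service's `check_command` such as `check_nrpe!check_camcontrol_da0` maps onto a `command_line`, here is a rough Python sketch of the macro substitution. It only models the idea: the text before the first `!` names the command, and the remaining parts become `$ARG1$`, `$ARG2$`, and so on. The function name is hypothetical, and the `$USER1$` value is an assumed plugin directory, since the real one depends on the installation.

```python
COMMANDS = {
    "check_ssh": "$USER1$/check_ssh -H $HOSTADDRESS$",
    "check_nrpe": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
}

MACROS = {
    "$USER1$": "/usr/local/nagios/libexec",  # assumed plugin directory
    "$HOSTADDRESS$": "mail.example.com",
}


def expand_check_command(check_command, commands, macros):
    """Turn a service's check_command into the command line to execute."""
    # "name!arg1!arg2" -> command name plus positional arguments.
    name, *args = check_command.split("!")
    line = commands[name]
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


if __name__ == "__main__":
    print(expand_check_command("check_ssh", COMMANDS, MACROS))
    print(expand_check_command("check_nrpe!check_camcontrol_da0", COMMANDS, MACROS))
```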

393
Several communication protocols exist between NAGIOS and its plugins.
[figure: diagram with the labels NAGIOS, external command file, check_xyz (plugin), check_by_ssh (plugin), check_nrpe (plugin), check_snmp (plugin), NSCA (daemon), sshd (daemon), nrpe (daemon), snmpd (daemon), send_nsca (client), result of service check, client]

NRPE: NAGIOS Remote Plugin Executor
- TCP port 5666
- active monitoring: NAGIOS initiates the checks
NSCA: NAGIOS Service Check Acceptor
- passive monitoring: NAGIOS receives the results of checks that it did not initiate
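For passive monitoring, a result reaches NAGIOS as a line in its external command file, using the `PROCESS_SERVICE_CHECK_RESULT` external command. The Python sketch below only builds such a line. The function name is hypothetical, and writing the line to the command file (whose path depends on the installation) is deliberately left out.

```python
import time


def passive_service_result(host, service, return_code, output, timestamp=None):
    """Format one passive service check result for the external command file.

    Layout: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    if timestamp is None:
        timestamp = int(time.time())
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


if __name__ == "__main__":
    # Return code 2 means CRITICAL, following the plugin exit code convention.
    print(passive_service_result("mail.example.com", "CAMCONTROL", 2, "CAMCONTROL ALERT"))
```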

394
Writing a plugin is easy:

    #!/bin/sh
    if [ ... ]
    then
        # condition OK
        echo "OK - blabla"
        exit 0
    fi
    if [ ... ]
    then
        # condition WARNING
        echo "WARNING - blabla"
        exit 1
    fi
    if [ ... ]
    then
        # condition CRITICAL
        echo "CRITICAL - blabla"
        exit 2
    fi
    if [ ... ]
    then
        # condition UNKNOWN
        echo "UNKNOWN - blabla"
        exit 3
    fi

Any programming language will do.

Defining a contact

    define contact{
        contact_name                   nagios-admin
        alias                          Nagios Admin
        service_notification_period    24x7
        host_notification_period       24x7
        service_notification_options   w,u,c,r
        host_notification_options      d,r
        service_notification_commands  notify-by-email
        host_notification_commands     host-notify-by-email
        email                          [email protected]
    }
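Coming back to the plugin convention shown earlier on this page (one status line on standard output, exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN), here is the same skeleton sketched in Python. The load average check and its thresholds are hypothetical choices made for the example.

```python
import os

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_load(load, warn=4.0, crit=8.0):
    """Map a 1-minute load average to an (exit code, status line) pair."""
    if load is None:
        return UNKNOWN, "UNKNOWN - load average not available"
    if load >= crit:
        return CRITICAL, f"CRITICAL - load average is {load:.2f}"
    if load >= warn:
        return WARNING, f"WARNING - load average is {load:.2f}"
    return OK, f"OK - load average is {load:.2f}"


def main():
    """Print the status line and return the exit code for NAGIOS."""
    try:
        current = os.getloadavg()[0]
    except (AttributeError, OSError):
        # The platform cannot report a load average.
        current = None
    code, message = check_load(current)
    print(message)
    return code
```

When installed as a plugin, the script would end with `sys.exit(main())` (after `import sys`) so that NAGIOS receives the exit code.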

395
Notification via email

    From Fri Aug 3 12:10:
    Date: Fri, 3 Aug :10: (CEST)
    From: [email protected]
    To: [email protected]
    Subject: ::NAGIOS:: ** PROBLEM alert - mail.example.com/camcontrol is CRITICAL **

    ***** Nagios 2.6 *****

    Notification Type: PROBLEM

    Service: CAMCONTROL
    Host: mail.example.com
    Address: mail.example.com
    State: CRITICAL

    Date/Time: Fri Aug 3 12:10:02 CEST 2007

    Additional Info:

    CAMCONTROL ALERT :09:28 da0 [COMPAQ RAID 1 VOLUME reco]

Notification via email (continued)

    From [email protected] Fri Aug 3 12:24:
    Date: Fri, 3 Aug :25: (CEST)
    From: [email protected]
    To: [email protected]
    Subject: ::NAGIOS:: ** RECOVERY alert - mail.example.com/camcontrol is OK **

    ***** Nagios 2.6 *****

    Notification Type: RECOVERY

    Service: CAMCONTROL
    Host: mail.example.com
    Address: mail.example.com
    State: OK

    Date/Time: Fri Aug 3 12:25:02 CEST 2007

    Additional Info:

    CAMCONTROL OK :24:28 da0 [COMPAQ RAID 1 VOLUME OK]

396
17.8 NAGIOS + OREON
The NAGIOS interface is not very convenient. Another web interface can be offered instead: OREON.

17.9 CACTI
