************************************************************************
* Myricom GM networking software and documentation                     *
* Copyright (c) 2005 by Myricom, Inc.                                  *
* All rights reserved.  See the file `COPYING' for copyright notice.   *
************************************************************************

README-aix for gm-2.0

Supported OS/processors:

    Power4, Power5, PowerPC-970 running AIX 5.2 (32-bit or 64-bit OS)

    Explicitly tested on JS20, 7028-6C4 Power4, P550 Squadron and
    various other Squadron models.

    Other AIX platforms have not been tested, but may be supportable
    by this binary.  If you need support for such platforms, please
    contact help@myri.com.

Supported NICs:

    PCIXD, PCIXF

    PCI64-based NICs (PCI64, PCI64A, PCI64B, PCI64C) are not supported.

Note: gm-2.0 does not interoperate with gm-1.x.  Hosts running gm-1.x
cannot communicate with hosts running gm-2.0.

Table of Contents:
-----------------

  I. GM Binary Installation
     a. Unpacking the GM driver
     b. Installing the GM driver and software
     c. Enabling IP over Myrinet (Ethernet emulation) (OPTIONAL)
     d. Testing the GM Installation

  II. Verifying GM Performance

  III. Improving IP Performance

  IV. Cautions/Caveats
     a. 32/64-bit Issues
     b. Shared GM Library

  V. Miscellaneous
     a. Stopping GM software and Unloading the GM driver
     b. De-installation of the GM Software
     c. Log Files

************************************************************************
If you encounter difficulties, please consult the FAQ

    http://www.myri.com/scs/FAQ/

and direct all technical support questions to help@myri.com.
************************************************************************

=========================
I. GM Binary Installation
=========================

These GM binaries assume the use of PCIX-based NICs.  PCI64 and PCI32
based NICs are not supported.

--------------------------------------
a. Unpacking the GM driver
--------------------------------------

    gunzip gm-2.0.21_AIX.tar.gz
    tar xvf gm-2.0.21_AIX.tar

This will create a directory called gm-2.0.21_AIX under the current
working directory.

-----------------------------------------
b. Installing the GM driver and software
-----------------------------------------

The GM driver and associated software are shipped as an AIX Licensed
Program Product (LPP).

To install the GM LPP and start the mappers on an AIX system where GM
is not currently installed:

    su root
    cd gm-2.0.21_AIX
    installp -a -d devices.pci.c1144380 all
    cfgmgr
    /opt/gm/sbin/gm_start

After installation, all of the GM software will be contained under the
directory /opt/gm, except for the driver itself and its config and
unconfig methods.

If a version of the GM LPP is already installed, uninstall the old
version before installing the new one.  For directions on how to
uninstall the GM driver, refer to the "Miscellaneous" section of this
README-aix.

Once the GM LPP is installed, subsequent reboots will automatically
load the gm driver and start the mappers.

-----------------------------------------------------------
c. Enabling IP over Myrinet (Ethernet emulation) (OPTIONAL)
-----------------------------------------------------------

If you would like to run IP over Myrinet (Ethernet emulation), you must
first configure an IP address and netmask for the Ethernet emulation
device.  This can be done with the commands:

    su root
    smit tcpip
      --> Minimum Configuration & Startup

Assuming that AIX has assigned the name "ent2" to the Myrinet device
(the corresponding network interface is then "en2"), IP over Myrinet is
enabled as follows:

    ifconfig en2 up

The following command terminates and detaches IP over Myrinet:

    ifconfig en2 detach

For suggestions on improving performance, please refer to section
"III. Improving IP Performance".

-----------------------------------------
d. Testing the GM Installation
-----------------------------------------

Once the GM software has been properly installed on all of the hosts in
your cluster, you are ready to validate your Myrinet installation by
performing the following sequence of tests.

  * Check the LEDs on each switch port and NIC port
  * Run gm_board_info on one host
  * Run gm_debug to test the PCI bandwidth
  * Run gm_allsize to test the links in the network
  * Run gm_stress to test the network

Each of these steps is detailed in the Troubleshooting section of the
FAQ

    http://www.myri.com/scs/FAQ/

The test programs (gm_board_info, gm_debug, gm_allsize, gm_stress) are
available in /opt/gm/bin.  A README describing each of these tests can
be found in /opt/gm/bin/README.

=============================
II. Verifying GM Performance
=============================

We recommend the following test to verify the GM performance.

    cd /opt/gm/bin
    ./gm_debug --no-counters

This gm_debug test displays the results of the hardware benchmark test
of the PCI bus with the DMA engine of the Myrinet NIC.  The output of
this command indicates the maximum sustained bandwidth that can be
obtained from the PCI bus, and thus provides an upper bound on GM
performance.  A detailed description of this benchmark can be found in
the FAQ (http://www.myri.com/scs/FAQ/).

The output of this command also tells you if the Myrinet NIC was
correctly detected as 64-bit / 133 MHz, for example.  If the NIC was
not correctly detected by the BIOS, you should suspect a riser card
problem or a PCI slot problem.

Performance graphs (http://www.myri.com/myrinet/performance) for GM on
Linux are available.  The performance measurements were obtained by
running gm_allsize tests for latency and bandwidth as described in the
FAQ entry ("What are the run-time options to gm_allsize?").

==============================
III. Improving IP Performance
==============================

In order to improve performance, consider enabling window scaling and
timestamps on all nodes.  On AIX this can be done with the command
'ifconfig enX rfc1323 1', where 'X' corresponds to the Myrinet Ethernet
interface.

Also, if there are Linux nodes in the cluster, you should disable the
TCP NewReno code on AIX with the command 'no -o tcp_newreno=0'.

Consult the "Running IP" section of the FAQ (http://www.myri.c for
other related questions.

====================
IV. Cautions/Caveats
====================

-------------------------------------
a. 32/64-bit Issues
-------------------------------------

The GM driver and library run on both 32-bit and 64-bit kernels.

The gm driver contains both 32-bit and 64-bit objects.  The mode of the
driver that is loaded will always match the mode of the kernel that is
running.  To determine the kernel mode, use the AIX command
'prtconf -k'.

The shared gm library (/opt/gm/lib/libgm.a) contains both 32-bit and
64-bit shared objects.  The library object that is used will always
match the mode of the application that is linking to it.  To determine
the mode of a GM application, use the command 'file <application>'.

Running 32-bit GM applications on a 64-bit kernel is supported.

Running 64-bit GM applications on a 32-bit kernel is _NOT_ supported.
Attempting to run 64-bit GM applications on a 32-bit kernel will
produce undefined results.

For convenience, all of the test binaries in /opt/gm/bin are compiled
in 32-bit mode so that they will execute on 32-bit or 64-bit AIX
kernels.

-------------------------------------
b. Shared GM Library
-------------------------------------

The GM library is a shared library called libgm.a in the directory
/opt/gm/lib.  The LIBPATH environment variable should include
'/opt/gm/lib', or the '-L' option should be used when linking GM
applications.
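
The LIBPATH advice above can be sketched as a small shell fragment.
This is only an illustration: prepend_libpath is a hypothetical helper,
not part of the GM distribution, and the link line in the comment
assumes a hypothetical application my_app.c and a compiler driver
named cc.

```shell
#!/bin/sh
# Sketch only: GM_LIB_DIR is the library directory named in this README;
# prepend_libpath is a hypothetical helper, not a GM tool.
GM_LIB_DIR=/opt/gm/lib

prepend_libpath() {
    # Add directory $1 to the front of LIBPATH unless it is already listed.
    case ":${LIBPATH:-}:" in
        *":$1:"*) ;;                              # already present
        *) LIBPATH="$1${LIBPATH:+:$LIBPATH}" ;;   # prepend, no stray colon
    esac
    export LIBPATH
}

prepend_libpath "$GM_LIB_DIR"
echo "LIBPATH=$LIBPATH"

# Alternative at link time (assumed file names, shown for illustration):
#   cc -o my_app my_app.c -L/opt/gm/lib -lgm
```

Checking for an existing entry keeps LIBPATH from growing each time the
fragment is sourced, for example from a login profile.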
For convenience, all of the test binaries in /opt/gm/bin are statically
linked to the gm library, so LIBPATH does not need to be set in the
environment when executing them.

================
V. Miscellaneous
================

---------------------------------------------------
a. Stopping GM software and Unloading the GM driver
---------------------------------------------------

To detach any IP over Myrinet interfaces, stop the mapper or mappers,
and unload the GM driver:

    su root
    /opt/gm/sbin/gm_stop

-------------------------------------
b. De-installation of the GM Software
-------------------------------------

To de-install the GM driver and software, first unload the GM driver
(see above).  Then run the following commands to remove all of the GM
software and drivers:

    su root
    installp -u devices.pci.c1144380.rte

------------------------------
c. Log Files
------------------------------

On AIX, all GM informational, warning and error messages are sent to
the console.  If there are problems, it may be necessary to view
console messages for any or all nodes in the cluster.  Console
redirection or buffering for all nodes should be considered at cluster
installation time, before problems occur.

GM map files are stored in the directory /var/adm/ras/gm_mapper in the
following format:

    gm_map_B_P

To generate output from a mapper for debugging purposes, send a 'HUP'
signal to the mapper process.  For example:

    kill -HUP <pid>

where <pid> is the process ID of the mapper.  This will generate an
output file in the /var/adm/ras/gm_mapper directory of the form:

    verbose_B.out
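
When collecting files for a support request, it can help to list what
the mapper has written so far.  The fragment below is a minimal sketch
under stated assumptions: list_mapper_files is a hypothetical helper,
not a GM tool, and the glob patterns gm_map_* and verbose_*.out are
inferred from the file name forms given above.

```shell
#!/bin/sh
# Sketch only: list_mapper_files is a hypothetical helper, not a GM tool.
# GM_MAPPER_DIR defaults to the map-file directory named in this README.
GM_MAPPER_DIR=${GM_MAPPER_DIR:-/var/adm/ras/gm_mapper}

list_mapper_files() {
    # Print map files (gm_map_*) and verbose dumps (verbose_*.out) found
    # in directory $1, one per line.  Print nothing if $1 does not exist.
    [ -d "$1" ] || return 0
    for f in "$1"/gm_map_* "$1"/verbose_*.out; do
        # An unmatched glob stays literal, so keep only real files.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

list_mapper_files "$GM_MAPPER_DIR"
```

On a host without GM installed, the fragment simply prints nothing.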