UncleNUC Wiki

Second chance for NUCs


lab:stack_of_nucs:ansible_playbook_-_fah_installation

Differences

This shows you the differences between two versions of the page.

lab:stack_of_nucs:ansible_playbook_-_fah_installation [2023/05/19 02:14] – [Check Temperature] user
lab:stack_of_nucs:ansible_playbook_-_fah_installation [2024/05/06 02:10] (current) – removed user
Line 1: Line 1:
-====== Ansible Playbook - FAH Installation ====== 
-In our previous step we [[Ansible Playbook - CMOS|checked the health of CMOS batteries]] on our [[start|Stack of NUCs]]. 
  
-Now we are going to create and run an Ansible playbook to set up [[https://foldingathome.org/|Folding at Home (FAH)]] on the nodes. I have updated the playbook by ajacocks to use the current release and added a quick fix. 
- 
-Please note that the NUCs I am using in this lab have only 4 cores, and for some WUs (work units) the client will only use 3 cores. So don't expect to score many points with these small boxes. 
- 
-Purpose: 
-  * Demonstrate running a complex workload: a service combined with configuration files 
- 
-References 
-  * [[https://github.com/ajacocks/fah]] 
-====== Step 1 - Install the fahcontrol app on NUC 1 ====== 
-The official download [[https://foldingathome.org/alternative-downloads/?lng=en|here]] does not work with Ubuntu 22.04. Use [[https://github.com/cdberkstresser/fah-control]]. 
- 
-  - Open a shell on [[NUC 1]] 
-  - Install packages 
-    * ''sudo apt-get install -y python3-stdeb python3-gi python3-all python3-six debhelper dh-python gir1.2-gtk-3.0'' 
-  - Clone the repo and run the command 
-    * ''git clone https://github.com/cdberkstresser/fah-control.git'' 
-    * ''cd fah-control'' 
-    * ''./FAHControl'' 
- 
-====== Step 2 - Install the FAH client using Ansible ====== 
-From [[NUC 1]], log in to the Ansible control node, [[NUC 2]]. 
- 
-  - Change directory to /home/ansible/my-project 
-  - <code>git clone --branch support-7-6-21 https://github.com/doritoes/fah.git</code> 
-  - Change directory to ''/home/ansible/my-project/fah'' 
-  - Modify file ''/home/ansible/my-project/fah/inventory'' 
-    * copy your ansible node IPs from the file /home/ansible/my-project/hosts to the [clients] section 
-    * chost='(IP of Control Node)' 
-    * cpass='(control-node-password)' 
-    * username='(Yourname @ folding@home)' 
-    * team='(if you support a team)' 
-    * passkey='(redacted passkey from folding@home)' 
-  - ''ansible-playbook main.yml'' 
-    * if you encounter a DNS lookup failure on some or all nodes 
-      * your wireless router should be setting DNS information as part of DHCP 
-      * did you disable the DNS stub resolver in earlier steps? 
-    * if you cannot connect with the control app and/or you see an error regarding a locked database 
-      * reboot the node to clear the error 
-      * it seems that running the playbook on an already configured system can start multiple copies of FAH, which causes the problem; rebooting solves the issue 
-  - Reboot all the clients to ensure the service registers properly and no double processes are running 
-    * ''ansible clients -m reboot'' 
-    * If you want to confirm your FAH configuration copied correctly, see the optional section below 
-  - On [[NUC 1]], open the FAH control program 
-    * Add clients one at a time in FAHControl 
-      * Any name you want 
-      * IP address of the client 
-      * Control password you used configuring FAH 
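Before running ''ansible-playbook main.yml'' in Step 2, it can help to confirm that none of the placeholder values in the ''inventory'' file were left unedited. The following is a minimal sketch, not part of the fah repo. It assumes the placeholders keep the ''='(...)''' form shown in the list above. The sample inventory layout, the ''192.0.2.x'' addresses, and the function name ''find_placeholders'' are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical inventory content for illustration only; the real file in the
# fah repo may be laid out differently. Addresses are documentation examples.
sample_inventory="[clients]
192.0.2.11
192.0.2.12

[clients:vars]
chost='192.0.2.10'
cpass='(control-node-password)'
username='Anonymous'"

# Read an inventory on stdin and print (with line numbers) every line whose
# value still starts with '( which marks an unedited placeholder.
find_placeholders() {
    grep -n "='("
}

# Report leftovers, if any, for the sample above.
if printf '%s\n' "$sample_inventory" | find_placeholders; then
    echo "Edit the lines above before running ansible-playbook main.yml"
fi
```

Against the real file, the same function could be called as ''find_placeholders < /home/ansible/my-project/fah/inventory''.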
- 
-====== Next Step ====== 
-Congratulations! Your [[start|Stack of NUCs]] is now fully occupied running a valuable workload! Next up is [[Ansible Playbook - FAH Removal]], where we disable FAH and remove it. 
- 
-====== Optional ====== 
-===== Check FAH Status ===== 
-  - Check the FAHClient status 
-    * ''ansible-playbook checkfahstatus.yml'' 
-    * <file yaml checkfahstatus.yml> 
---- 
-- hosts: clients 
-  become: true 
-  become_user: root 
-  tasks: 
-    - name: Get FAH service Status 
-      ansible.builtin.systemd: 
-        state: "started" 
-        name: "FAHClient" 
-      register: fah_service_status 
-    - name: Show status 
-      debug: 
-        msg: "{{ fah_service_status.status.ActiveState }}" 
-</file> 
-  - Check the config file 
-    * <file yaml check-fah-config.yml> 
---- 
-- hosts: clients 
-  become: true 
-  become_user: root 
-  tasks: 
-    - name: Read FAH client from config.xml 
-      shell: cat /etc/fahclient/config.xml 
-      changed_when: false 
-      register: configuration 
-    - name: Dump configuration 
-      debug: 
-        var: configuration.stdout_lines 
-</file> 
-  - If the configuration did not apply successfully, re-configure using the following playbook. Be sure the "team" variable is present in the ''inventory'' file. 
-    * <file yaml reconfigure-fah.yml> 
---- 
-- hosts: all 
-  tasks: 
-    - name: Install FaH config 
-      template: 
-        src: /home/ansible/my-project/fah/roles/fahclient/templates/sample-config.xml.j2 
-        dest: /etc/fahclient/config.xml 
-    - name: Restart FaH 
-      systemd: 
-        name: FAHClient 
-        state: restarted 
-</file> 
-    * You might need to reboot the NUCs, not just the service 
-      * ''ansible clients -m reboot'' 
- 
-===== Work with FAH Commands ===== 
-  - Check points per day (PPD) and queue information: 
-    * <code bash>ansible clients -a "FAHClient --send-command ppd"</code> 
-    * <code bash>ansible clients -a "FAHClient --send-command queue-info"</code> 
-  - Tell all nodes to finish their work unit then pause 
-    * <code bash>ansible -i ../hosts all -a "FAHClient --send-command finish"</code> 
-    * It's good form to finish the work units that are assigned to you before removing FAH from the nodes 
-  - Pausing and unpausing folding 
-    * <code bash>ansible clients -a "FAHClient --send-pause"</code> 
-    * <code bash>ansible clients -a "FAHClient --send-unpause"</code> 
- 
-===== Check Queue State ===== 
-  - Check the FAHClient queue states 
-    * ''ansible-playbook check-fah-queues.yml'' 
-    * <file yaml check-fah-queues.yml> 
---- 
-- name: Check queue 
-  hosts: clients 
-  remote_user: ansible 
-  become: true 
-  tasks: 
-    - name: Gather queue information 
-      shell: "FAHClient --send-command queue-info" 
-      register: fahqueue 
-      changed_when: false 
-    - name: Queue status 
-      debug: 
-        msg: "{{(fahqueue.stdout_lines[4:-1] | join | from_json)[0].state }}" 
-</file> 
-  - If the queue is empty, the test will fail 
-  - If the node has paused folding, the status will be "READY" 
-  - If the node is currently folding, the status will be "RUNNING" 
- 
-===== Check Work Unit ETAs ===== 
-  - Check the FAHClient unit ETAs 
-    * ''ansible-playbook check-fah-eta.yml'' 
-    * <file yaml check-fah-eta.yml> 
---- 
-- name: Check queue ETA 
-  hosts: clients 
-  remote_user: ansible 
-  become: true 
-  tasks: 
-    - name: Gather queue information 
-      shell: "FAHClient --send-command queue-info" 
-      register: fahqueue 
-      changed_when: false 
-    - name: Queue status 
-      debug: 
-        msg: "{{(fahqueue.stdout_lines[4:-1] | join | from_json)[0].eta }}" 
-</file> 
-===== Check CPU Utilization ===== 
-Check the CPU load on the nodes 
-  * ''ansible-playbook getcpu.yml'' 
-  * <file yaml getcpu.yml> 
---- 
-- hosts: all 
-  gather_facts: false 
-  tasks: 
-    - name: Get CPU usage 
-      shell: "top -b -n 1" 
-      register: top 
-      changed_when: false 
-    - name: Set CPU usage facts 
-      set_fact: 
-        user_cpu: "{{ top.stdout_lines[2].split()[1] }}" 
-        system_cpu: "{{ top.stdout_lines[2].split()[3] }}" 
-        nice_cpu: "{{ top.stdout_lines[2].split()[5] }}" 
-    - name: Output CPU usage facts 
-      debug: 
-        msg: 
-          - "User CPU usage: {{ user_cpu }}" 
-          - "System CPU usage: {{ system_cpu }}" 
-          - "Nice CPU usage: {{ nice_cpu }}" 
-</file> 
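To see which values ''getcpu.yml'' picks out, the same field selection can be tried in a shell on a sample of ''top'' output. This is only a sketch: the sample text below is made up for illustration, and the function name ''cpu_fields'' is hypothetical. The playbook takes the third output line (''stdout_lines[2]'') and splits it on whitespace; awk numbers fields from 1, so ''$2'', ''$4'' and ''$6'' correspond to ''split()[1]'', ''[3]'' and ''[5]''.

```shell
#!/bin/sh
# Illustrative first three lines of `top -b -n 1` output (values are made up).
sample_top='top - 02:14:00 up 1 day,  3:02,  1 user,  load average: 4.00, 4.01, 3.98
Tasks: 120 total,   2 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s): 74.2 us,  0.8 sy, 24.1 ni,  0.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st'

# Read top output on stdin and print the user, system and nice CPU
# percentages from the third line, separated by spaces.
cpu_fields() {
    awk 'NR == 3 { print $2, $4, $6 }'
}

printf '%s\n' "$sample_top" | cpu_fields
```

On a node, the same function could be fed live data with ''top -b -n 1 | cpu_fields''.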
- 
-===== Check Temperature ===== 
-In this example we will look into monitoring the CPU and chipset temperature of our NUCs. 
-  - Install lm-sensors 
-    * Option 1 - Ad Hoc 
-      * <code>ansible -i hosts all -m apt -a "name=lm-sensors state=present"</code> 
-    * Option 2 - Playbook in /home/ansible/my-project/fah/lm-sensors.yml 
-      * <file yaml lm-sensors.yml> 
---- 
-- name: lm-sensors install 
-  hosts: clients 
-  remote_user: ansible 
-  become: true 
-  tasks: 
-    - name: Install lm-sensors 
-      apt: 
-        name: lm-sensors 
-        update_cache: true 
-    - name: Detect sensors 
-      ansible.builtin.command: sensors-detect --auto 
-</file> 
-    * ''ansible-playbook lm-sensors.yml'' 
-  - Check Temperature 
-    * ''ansible clients -a sensors'' 
-    * ''ansible clients -a "sensors -j"'' 
-    * ''ansible-playbook check-temps.yml'' 
-      * <file yaml check-temps.yml> 
---- 
-- name: Check temperature 
-  hosts: clients 
-  remote_user: ansible 
-  become: true 
-  tasks: 
-    - name: Gather CPU temperature 
-      shell: "sensors | grep 'Package id 0:' | cut -c17-20" 
-      register: temp 
-      changed_when: false 
-    - name: Check CPU temperature 
-      fail: 
-        msg: "{{ temp.stdout }}" 
-      when: (temp.stdout | int > 80) 
-</file> 
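The ''cut -c17-20'' step in ''check-temps.yml'' depends on the exact column layout of the ''sensors'' output. The sketch below reproduces that extraction on a sample line and shows a field-based alternative that does not rely on character positions. The sample line is illustrative rather than captured from a NUC, and the function names ''temp_by_columns'', ''temp_by_field'' and ''is_too_hot'' are hypothetical.

```shell
#!/bin/sh
# Illustrative line in the layout `sensors` commonly prints for an Intel CPU
# package; the values are made up.
sample='Package id 0:  +45.0°C  (high = +100.0°C, crit = +100.0°C)'

# Same extraction as the playbook: characters 17-20 of the matching line.
temp_by_columns() {
    grep 'Package id 0:' | cut -c17-20
}

# Alternative: take the 4th whitespace-separated field and strip everything
# that is not a digit or a dot (the sign and the degree symbol).
temp_by_field() {
    awk '/Package id 0:/ { gsub(/[^0-9.]/, "", $4); print $4 }'
}

# Succeed when the temperature ($1, e.g. 45.0) exceeds the limit ($2, whole
# degrees). Only the integer part is compared, like the playbook's int filter.
is_too_hot() {
    [ "${1%%.*}" -gt "$2" ]
}

temp=$(printf '%s\n' "$sample" | temp_by_field)
if is_too_hot "$temp" 80; then
    echo "CPU temperature too high: $temp"
else
    echo "CPU temperature OK: $temp"
fi
```

On a node with lm-sensors installed, ''sensors | temp_by_field'' would apply the same parsing to live readings.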
- 
-See [[https://github.com/aisbergg/ansible-role-lm-sensors|this link]] for more information on using this sensors information with Ansible. 
lab/stack_of_nucs/ansible_playbook_-_fah_installation.1684462461.txt.gz · Last modified: 2023/05/19 02:14 by user