Rendeiro lab manual

A guide to the Rendeiro lab at CeMM

Abstract

You are reading version ‘6bcc79b’.

The Rendeiro Lab Manual serves as a comprehensive guide to the lab’s operations, culture, and best practices.

It encompasses essential protocols for maintaining an inclusive and respectful work environment, while fostering innovation and collaboration.

This document provides detailed instructions for onboarding new members, structuring research projects, keeping records, and utilizing modern computational tools and resources.

Key sections include guidelines for source code management, lab notebook maintenance, data organization, manuscript preparation, and project planning, emphasizing a shallow-pass strategy for iterative research execution.

With a strong focus on reproducibility and efficiency, the manual also outlines standards for effective communication, collaborative efforts, and the use of cutting-edge technologies in data analysis and visualization.

Designed to be a living resource, the manual ensures alignment with CeMM’s standards and the lab’s mission of advancing molecular medicine through robust and innovative science.

Lab manual

Welcome to the Rendeiro Lab Manual. This manual provides comprehensive information about the lab’s culture, procedures, and workflows to ensure a collaborative and efficient research environment.

The manual is hosted in the lab-manual repository on GitHub. It is written in Markdown and can be converted to HTML and PDF using Pandoc.

This manual is open source and maintained collaboratively. Anyone on GitHub can propose changes.

Building the manual

The project includes a Makefile to streamline the development process.

Key targets include:

Styling for the manual is controlled by a custom CSS file, which ensures a nice appearance in both HTML and PDF formats.

Editing content

To contribute:

  1. Edit or create files directly on GitHub or locally on your system.
  2. Submit a pull request with a clear, one-line description of the changes made.
  3. Follow best practices by adding reviewers and referencing related issues, if applicable.

To add a table of contents to any document, use mdformat-toc: insert <!-- mdformat-toc start --> where the table of contents should appear, then run mdformat <file.md> on the edited file, or make format to format all files.

Acknowledgements

We thank the following labs for sharing their open-source lab manuals, which inspired this project:

TODO

Getting started

Welcome to the Rendeiro lab at CeMM!

You are at the start of an adventure; don’t forget to enjoy the journey.

Get informed

The first step is to get informed about the lab’s mission and philosophy, the code of conduct and how we operate (rest of the manual).

Feel free to go ahead and read the mission statement and code of conduct to get started.

The rest of this page gives newcomers practical pointers on how to get started in the lab.

Get started

Lab-specific

Onboarding to-dos (responsibility of the PI):

  1. Ask the new team member for their contact information (optional, but recommended for safety reasons) - add your own information here;
  2. Add the new team member to lab Microsoft team and mailing list;
  3. Share the lab’s resources with the new team member:
  4. Make sure the new team member follows up with the remainder of the CeMM onboarding (IT, safety, etc.).

CeMM-specific

IT services

Please read https://cemmat.sharepoint.com/sites/IT-Resources for updated information.

Intranet

Consult the intranet for updated information on everything CeMM: https://cemmat.sharepoint.com/sites/Intranet

Mail

The default email addresses have the form <user>@cemm.oeaw.ac.at due to our relationship with the Austrian Academy of Sciences. The short form <user>@cemm.at can also receive emails and is mandatory to access/register Microsoft Office 365 cloud resources: OneDrive, Teams, Word, Excel, PowerPoint, etc.

Forwarding CeMM emails to external services is not allowed.

Shared drives and directories

Windows shares:

Log in with your CeMM credentials. If asked for a domain name, use “cemmint”.

VPN
  1. Get a MUW ID by submitting a form to the CeMM IT team (https://cemmat.sharepoint.com/sites/IT-Resources).
  2. Get authorization from MUW to access the VPN by submitting the form to the CeMM IT team.
  3. Connect:

Mission and philosophy

Mission statement

At the Rendeiro Lab, we are committed to unraveling the complexities of human aging and pathology through the integration of computational innovation and molecular precision. As part of CeMM and the Ludwig Boltzmann Institute for Network Medicine, our mission is to decode the architectural patterns of the human body—from cellular microenvironments to organ-wide structures—and to understand how these patterns influence health, aging, and the onset of disease. We aim to uncover the mechanisms through which cellular and tissue-level changes contribute to the progressive decline in physiological function, and to leverage this knowledge to predict disease risk, facilitate early diagnosis, and inspire transformative therapeutic strategies.

Our work is grounded in the development and application of computational methods that analyze spatial data, including digital pathology, spatial transcriptomics, and highly multiplexed imaging. By integrating these high-dimensional data streams with molecular and clinical insights across the human lifespan, we strive to answer key questions: How do cellular alterations scale to tissue dysfunction? What are the molecular underpinnings of age-related diseases? How can we differentiate between normal aging and pathology? These questions guide our efforts to generate actionable insights that bridge the gap between fundamental research and practical applications in healthcare.

Through close collaboration with clinicians, pathologists, and researchers across disciplines, we ensure that our findings are validated in real-world contexts. We believe that understanding the intricate connections between cellular dynamics and organ-level architecture will lead not only to better interventions for age-associated diseases but also to a deeper appreciation of the processes that sustain human life. With a commitment to innovation, collaboration, and real-world impact, the Rendeiro Lab aims to shape the future of aging and pathology research for the benefit of society.

Philosophy

Our lab operates at the intersection of biomedicine, computational biology, and systems biology research, guided by these core principles:

Code of conduct

Our lab aims to be a safe, professional, and encouraging place where everybody feels respected, appreciated, and free to contribute equally. All lab members must abide by the code of conduct:

Be respectful

No forms of harassment or discrimination are tolerated. All individuals, regardless of their age, gender identity, racial or ethnic background, sexual orientation, religion, culture, academic record, personal background, disability status, economic status, or mental health status, shall be treated with equal respect and recognition.

Please follow the guidelines below:

Be professional

All members are expected to conduct themselves in a professional manner. This involves acting with honesty, integrity, accountability, and respect for others.

Please follow the behaviours below:

Be proactive and inclusive

We can all be better if we help each other and do things for the good of the lab.

The following is encouraged:

This code of conduct was inspired by and adapts parts from the following:

Lab communication

In person meetings

Individual meetings

Lab meetings

Code review

Strategic collaborative projects (SCPs)

Hackathons

Yearly kick start/team building meeting

See also the “asking questions” section on the learning note.

Messaging

Use the lab’s mailing list (rendeirogroup@int.cemm.at) for searchable, archivable content (announcements, scheduling, papers).

Use Microsoft Teams for non-archival content (curiosities, non-urgent questions, fun stuff).

Use Signal’s lab group for quick messages (coordinating movement, real-time info, urgent questions, fun stuff).

Papers

Feel free to send around any paper you find interesting or relevant.

Send it via email (rendeirogroup@int.cemm.at) in the following format:

Fields with <> should be replaced with content.

Lab retreats

[!CAUTION] TODO

Social meetings and celebrations

Social gatherings, for example to celebrate personal and professional achievements, are welcome and encouraged. They should, however, follow CeMM's guidelines and rules.

Lab infrastructure

This document describes the infrastructure used in the lab, including CeMM-provided or our own infrastructure. So far it details only computational infrastructure.

Lab infrastructure

Hardware

VMs

Cloud resources

Read the documentation on using Azure web services here: azure.md

CeMM infrastructure

Refer to the CeMM Intranet documentation for updated information. Below are a few notes on things which are not covered there:

Printing from Linux

CeMM has Canon iR-ADV C5735/5740 printers. They support IPP printing through CUPS.

Install CUPS:

sudo apt-get install cups

Add printers:

sudo lpadmin -p CeMM_level_2 -E -v ipp://193.171.185.37/ipp -m everywhere
sudo lpadmin -p CeMM_level_3 -E -v ipp://193.171.185.212/ipp -m everywhere
sudo lpadmin -p CeMM_level_4 -E -v ipp://193.171.185.39/ipp -m everywhere
sudo lpadmin -p CeMM_level_5 -E -v ipp://193.171.185.40/ipp -m everywhere
sudo lpadmin -p CeMM_level_6 -E -v ipp://193.171.185.38/ipp -m everywhere
sudo lpadmin -p CeMM_level_7 -E -v ipp://193.171.185.41/ipp -m everywhere
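
The commands above follow one pattern per floor, so they can also be generated from a small table. The following is a minimal sketch, not an official setup script: the names `PRINTERS`, `lpadmin_command`, and `all_commands` are hypothetical, and the floor-to-IP mapping is simply copied from the commands above. It only builds the command lines and does not run them.

```python
from __future__ import annotations

import shlex

# Floor -> printer IP, copied from the lpadmin commands above.
PRINTERS = {
    2: "193.171.185.37",
    3: "193.171.185.212",
    4: "193.171.185.39",
    5: "193.171.185.40",
    6: "193.171.185.38",
    7: "193.171.185.41",
}


def lpadmin_command(level: int, ip: str) -> list[str]:
    """Build the argument list that registers one printer with CUPS."""
    return [
        "sudo", "lpadmin",
        "-p", f"CeMM_level_{level}",  # printer name
        "-E",                         # enable the printer and accept jobs
        "-v", f"ipp://{ip}/ipp",      # device URI
        "-m", "everywhere",           # driverless IPP Everywhere model
    ]


def all_commands() -> list[str]:
    """Return one shell-ready command line per floor, lowest floor first."""
    return [
        shlex.join(lpadmin_command(level, ip))
        for level, ip in sorted(PRINTERS.items())
    ]


if __name__ == "__main__":
    for line in all_commands():
        print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run on any machine; paste the output into a shell once it looks right.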

Project planning

This guide can serve as a standalone reference for project planning and fellowship writing. Each section builds on conceptual, strategic, and practical aspects separately, but they should be taken into consideration together for best results.

Background and concepts

Ideal vs reality of designing a project for public funding

Triangle of knowledge

Strategies

Project design

Balancing feasibility and novelty

Balancing profiling and perturbing

The two strategies can be linked. For instance, Aim 1 can be slightly incremental and based on profiling only, while Aim 2 can be more ambitious and novel, based on perturbations.

Shallow-pass strategy for project execution

The Shallow-pass strategy is a systematic approach to project execution that emphasizes early, rapid, and minimally viable progress through a project’s full timeline, followed by iterative deepening (both conceptual and technical) as needed. It draws on established principles from Agile development, Rapid prototyping, Incremental research, and Lean startup approaches.

Core concept

The strategy envisions a 2D space where:

Instead of moving deeply through each individual task in sequence (e.g., perfecting each aim before moving to the next), the Shallow-pass strategy encourages a shallow, complete pass over all key components of the project first.

This approach creates a “minimum viable version” of the entire project, akin to a minimum viable product (MVP), and allows for early identification of bottlenecks, feasibility issues, and unknowns.

Once this shallow layer is complete, depth is added step-by-step to specific areas where gains are most needed or most valuable. This “depth-first” progression occurs only when justified by clear metrics or emerging insights.

Execution

  1. Initial pass (shallow path): The goal is to complete a simple, end-to-end version of the project with minimal depth. This quick, rough pass identifies critical barriers, risks, and time-sinks. For example, instead of training a deep learning model on a large dataset, you might start with a simple logistic regression on a small subset, or run a pilot experiment to validate feasibility.
  2. Iterative deepening (vertical progression): Revisit earlier steps and selectively add depth to areas that yield the most benefit. If an aspect works well at shallow depth, further refinement may be unnecessary. For example, after a successful logistic regression, you might increase depth by training a neural network, scaling up the dataset, or adding multi-modal inputs.
  3. When to stop (completion): Avoid the “infinite perfection” trap by setting success metrics (e.g., 90% model accuracy) at the outset. Once these criteria are met, stop. If your classifier achieves 95% accuracy when the goal was 90%, additional work may have little value. Completion is defined by sufficiency, not perfection.
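
The three steps above can be sketched as a small loop. This is an illustrative toy under stated assumptions, not a planning tool: `Step`, `from_table`, and `shallow_pass` are hypothetical names, and the score tables stand in for real work (for example, the metric reached after a logistic regression versus a neural network).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    evaluate: Callable[[int], float]  # maps depth (1, 2, ...) to a metric in [0, 1]
    depth: int = 0  # 0 means "not started yet"


def from_table(scores: list[float]) -> Callable[[int], float]:
    """Hypothetical stand-in for real work: the metric reached at a given depth."""
    return lambda depth: scores[min(depth, len(scores)) - 1]


def shallow_pass(steps: list[Step], target: float, max_depth: int = 5) -> dict[str, int]:
    # 1. Initial pass: take every step to depth 1 so the project exists end to end.
    for step in steps:
        step.depth = 1

    # 2. Iterative deepening: add depth only where the metric is still below target.
    while True:
        scores = {step.name: step.evaluate(step.depth) for step in steps}
        below = [s for s in steps if scores[s.name] < target and s.depth < max_depth]
        # 3. When to stop: every step meets the target, or nothing can go deeper.
        if not below:
            break
        weakest = min(below, key=lambda s: scores[s.name])
        weakest.depth += 1

    return {step.name: step.depth for step in steps}


steps = [
    Step("data", from_table([0.95])),
    Step("model", from_table([0.70, 0.82, 0.91, 0.93])),
    Step("figures", from_table([0.92])),
]
final_depths = shallow_pass(steps, target=0.90)
print(final_depths)  # {'data': 1, 'model': 3, 'figures': 1}
```

Only the modelling step is deepened, and it stops at depth 3 even though depth 4 would score higher: the success metric set at the outset is already met.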

Common pitfalls and how to avoid them

| Pitfall | How to avoid |
| --- | --- |
| Going too deep, too soon | Focus on the shallow pass first. Move on even if the result isn’t “perfect.” |
| Perfectionism | Set “success metrics” early. If you meet them, stop! |
| Sunk cost fallacy | Shallow-pass shows which paths are dead ends. Pivot early. |
| Failure to prioritize | Invest time in paths that provide the most “return on depth”. |
| Getting stuck at step 1 | Even if step 1 is imperfect, keep moving to step 2. |

Writing process

1. Planning and supervisor coordination

2. Content development

Start with the big picture

Clarity is everything

Is a hypothesis necessary?

3. Proposal structure

Top-down approach

  1. Hypothesis/Objectives/Goals
  2. Impact:
    • Why should this be addressed?
    • What impact will success have (on science, society, policy, etc.)?
    • What will we learn even if the project fails?
  3. Aims: Typically 2-3 aims.
  4. Tasks for each aim (1-2 per aim):
    • Goal of the task
    • Why it matters
    • Required resources, data, or inputs
    • Methods, experiments, and tools required
  5. Introduction (3-4 points):
    1. The problem being addressed.
    2. What has been tried previously.
    3. Key technologies, datasets, or resources now available to address the problem.
    4. A hint of the specific approach you will take.
  6. Challenges and mitigation
    • Think of the top challenges in the proposal design.
    • Separate them by conceptual and technical
    • Try to preemptively address them at design stage

4. Proposal writing process

Gradual, hierarchical writing

Figures & visual aids

5. Use of AI/LLMs (Large Language Models)

Dos and Don’ts

Why use LLMs?

6. Feedback process

Summary of key takeaways

  1. Designing a project: Balance feasibility and novelty. Balance profiling (description) and perturbation (causation).
  2. Executing a project: Use the Shallow-pass strategy: shallow, fast progress first; deeper, slower progress later.
  3. Writing a proposal: Work step-by-step, define key objectives first, and address the big picture.
  4. Clarity and structure are essential for success.
  5. Involve supervisors early in project design, figure development, and editing.
  6. Leverage LLMs for support, but control the process.
  7. Learn from feedback — it’s one of the most valuable aspects of the process.

Additional Resources

[!CAUTION] TODO

Research

Project life cycle

Initializing a project

  1. Register your project in the lab’s project register:
    1. Go to https://cemmat.sharepoint.com/sites/rendeirolab and find the ‘Lab project register.xlsx’ or directly at https://cemmat.sharepoint.com/:x:/r/sites/rendeirolab/_layouts/15/Doc.aspx?sourcedoc=%7B4c72f84b-f33b-4162-a5e8-f05556fdf66b%7D&action=editnew
    2. Start a new row for your project and increment the Project ID by one.
    3. Choose an intuitive name for the project and avoid personal information (e.g. collaborator names).
  2. Create a project directory structure from the lab’s template: cookiecutter gh:rendeirolab/_project_template
  3. Create a git repository on GitHub (https://github.com/rendeirolab) with the same name as the project, make a first commit, and push to GitHub.
  4. Create a directory for the project in the CeMM cluster at: /research/groups/lab_rendeiro/projects/
    1. Create a directory inside called data to store raw data.
  5. Create a directory for the project in the CeMM cluster at: /nobackup/groups/lab_rendeiro/projects/
    1. Create a soft link between /research/.../data and /nobackup/.../data
  6. Create a cemm_metadata.json file in /research/groups/lab_rendeiro/projects/$PROJECT/
  7. Create a cemm_metadata.json file in /nobackup/groups/lab_rendeiro/projects/$PROJECT/

You can find various metadata JSON templates at /research/groups/lab_rendeiro/projects/_templates/.

Make sure to validate your JSON files, e.g.:

# pip install jsonvalidate
wget http://metadata.int.cemm.at/latest/cemm_metadata_schema.json
jsonvalidate -i cemm_metadata.json cemm_metadata_schema.json
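
Before running the validation above, a quick syntax check with the Python standard library can catch the most common mistakes. This is a rough sketch and not a JSON Schema validator: `check_metadata` is a hypothetical helper that only confirms the file parses and that any top-level keys listed under the schema's standard `required` keyword are present. Whether the CeMM schema lists its required keys at the top level is an assumption.

```python
from __future__ import annotations

import json
from pathlib import Path


def check_metadata(metadata_path: str | Path, schema_path: str | Path) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        metadata = json.loads(Path(metadata_path).read_text())
    except json.JSONDecodeError as err:
        return [f"{metadata_path}: invalid JSON ({err})"]

    if not isinstance(metadata, dict):
        return [f"{metadata_path}: top-level value is not a JSON object"]

    schema = json.loads(Path(schema_path).read_text())
    # Only the top-level "required" list is checked; nested rules are ignored.
    required = schema.get("required", [])
    return [f"missing required key: {key}" for key in required if key not in metadata]
```

A call such as `check_metadata("cemm_metadata.json", "cemm_metadata_schema.json")` returning an empty list does not replace full schema validation; it only means the file is well-formed enough to be worth validating properly.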

Developing a project

[!CAUTION] TODO

Maintaining files for a project

Make sure to maintain your metadata JSON files, in line with the existing data on disk.

Reporting research

No PowerPoint!

Publications

[!CAUTION] TODO

Authorship

[!CAUTION] TODO

Tools and technologies of standard use within the lab

File types of digital data

Below are the preferred types of technology to be used in the lab:

Engaging in new projects

[!CAUTION] TODO

Meta-science

Record keeping guidelines

This document serves as a comprehensive guide for maintaining and organizing records within the lab. It covers the management of source code, lab notebooks, shared files, and other key resources to ensure efficient and consistent documentation of lab activities. Effective record keeping is essential not only for scientific research, where it facilitates reproducibility, collaboration, and project management, but also for complying with requirements from funders and CeMM.

Source Code

The creation, updating, and maintenance of source code are conducted on GitHub: Rendeiro Lab GitHub. Each project is assigned its own GitHub repository, with its name registered in the Lab Project Registry.

Structuring Projects

Lab Notebooks

It is essential to maintain detailed, up-to-date records of daily activities. This includes:

Digital and Physical Notes

Shared OneDrive Folder

The official platform for file sharing is the CeMM-provided OneDrive, accessible here: Shared Documents.

Best Practices

Presentations

Figures

Fellowship Proposals

CeMM-specific requirements for projects

CeMM has specific guidelines for data management, in particular for projects that exist on the HPC cluster. Read more about it on the intranet.

In particular, each project should have a cemm_metadata.json file.

The research page also provides information on this.

Learning

Asking questions

It is important to have independent learning skills, for example finding information on your own. However, it is also important to engage others and ask questions when you are stuck. While not knowing something is okay, being unwilling to learn or asking questions without minimally trying to find the answer yourself is not acceptable.

When asking questions, be sure to:

Topics to learn

As a member of the lab you are responsible for being aware of, understanding, and eventually mastering the following topics:

Note that learning is an iterative and continuous process. You will need to revisit and revise your knowledge of each topic over time, each time gaining a deeper understanding or a new perspective on it.

Literature to know

Reviews

Data types

Digital pathology

Specific literature

Genetics of tissue and organ shape

Target discovery, prioritization and drugs

Courses

Programming, Computer Science

Machine Learning / Deep Learning

Statistics

Bioinformatics

Genomics

Medicine / Anatomy / Histology

Aging

Other courses (paid):

Books

Tutorials

Software packages

These are some of the software packages we use often in the lab. You can follow the direction of their development and learn about new versions by subscribing to their releases on GitHub (bell sign -> custom -> releases).

Data science

Statistics / machine learning

Tensor-specific

Deep learning

Visualization

Data-specific

Imaging:

Survival data:

Computational Pathology:

Web

Productivity and logistics

Data repositories

Manuscript writing

This document provides practical guidelines for writing a manuscript. For insights on planning a project please see the project planning guide instead.

Manuscript writing should really be called “manuscript crafting”: it involves much more than writing and formatting text or making figures, and it takes a lot of time, effort, and skill to craft a good manuscript.

Figures

A crucial part of crafting a great manuscript is good communication of ideas through visual elements (figures), and their alignment with the text.

Here are Andre’s tips for figure making, based on practice. The approach has changed a little over time but has mostly coalesced into a fairly simple system.

Making plots (Python, matplotlib, seaborn):

Recommended settings for matplotlib

Inkscape: Export to SVG

import matplotlib.pyplot as plt

plt.rcParams['savefig.bbox'] = 'tight'  # To ensure that legends and elements outside the axes are included
plt.rcParams['savefig.dpi'] = 300  # To make sure any rasterized elements have good quality
plt.rcParams['savefig.transparent'] = True  # To remove the white background
plt.rcParams['svg.fonttype'] = 'none'  # To allow font to be editable in Inkscape
plt.rcParams['font.family'] = 'Arial'  # Use Arial as the font

Adobe Illustrator: Export to PDF

This is similar to the above, except that we change the font settings:

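import matplotlib.pyplot as plt

plt.rcParams['savefig.bbox'] = 'tight'  # To ensure that legends and elements outside the axes are included
plt.rcParams['savefig.dpi'] = 300  # To make sure any rasterized elements have good quality
plt.rcParams['savefig.transparent'] = True  # To remove the white background
plt.rcParams['pdf.fonttype'] = 42  # To embed TrueType fonts so text stays editable in Illustrator
plt.rcParams['ps.fonttype'] = 42  # Same font handling for PostScript output
plt.rcParams['font.family'] = 'Arial'  # Use Arial as the font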
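
The two sets of settings above differ only in their font handling, so one way to keep them in sync is a small helper that returns the right dictionary for each target. This is a minimal sketch, not an established lab utility: `export_rc_params` and the target names `"inkscape"` and `"illustrator"` are hypothetical, and the values are copied from the settings above.

```python
from __future__ import annotations

# Settings shared by both export targets (values taken from the blocks above).
COMMON = {
    "savefig.bbox": "tight",      # include legends and elements outside the axes
    "savefig.dpi": 300,           # good quality for rasterized elements
    "savefig.transparent": True,  # no white background
    "font.family": "Arial",
}

# Font handling is the only part that depends on the editing tool.
FONT_SETTINGS = {
    "inkscape": {"svg.fonttype": "none"},                    # export to SVG
    "illustrator": {"pdf.fonttype": 42, "ps.fonttype": 42},  # export to PDF
}


def export_rc_params(target: str) -> dict:
    """Return the matplotlib rcParams for the given editing tool."""
    if target not in FONT_SETTINGS:
        raise ValueError(f"unknown target {target!r}; expected one of {sorted(FONT_SETTINGS)}")
    return {**COMMON, **FONT_SETTINGS[target]}
```

The returned dictionary can be applied globally with `plt.rcParams.update(export_rc_params("inkscape"))`, or for a single figure with `with plt.rc_context(export_rc_params("illustrator")):`.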

Assemble plots manually into a figure:

Inkscape

Post assembly, automatic assembly and conversion of file types (bash, inkscape, minify, pdfunite):

Only in very rare cases is it worth generating a full final figure as a whole.

Adobe Illustrator:

Open PDF in Adobe Illustrator

Congratulations! Now you have a clean plot ready for editing.

[!CAUTION] Removing clipping masks may affect elements, such as bars that extend offscreen.

Text

Manuscript text is usually written in Microsoft Word (docx) or LibreOffice Writer (odt). LaTeX is also supported; talk with Andre about starting with a LaTeX template.

Formatting

Use styles to format your text. Do not use whitespace (e.g. newlines and spaces) in the text to format the document.

Structure

File formats

Refer to research page for more details.

Offboarding

It is important to be aware well in advance of the procedure for offboarding and to communicate the expectations and requirements involved in it.

Procedure

It is the responsibility of the PI to communicate, initiate, and verify the following steps:

  1. Review all lab-specific records (see the record keeping section):
    • Lab notebook
    • OneDrive folder
    • GitHub account
  2. Review Data archiving and cleanup:
    • HPC cluster
    • Hilde workstation
    • Windows shares:
      • Groupdata: smb://int.cemm.at/files/groupdata/
      • Labs: smb://int.cemm.at/files/labs/
      • Home Folder: smb://int.cemm.at/files/home/<username>
  3. Review all CeMM-specific requirements in coordination with HR:
    • Data management requirements (if points above are well executed, then it should be taken care of)
    • E-mail
    • VPN access
  4. Ask the departing team member for their CeMM-independent contact information
  5. Ask the departing team member whether they consent to be listed on the lab website as an alumnus/alumna
  6. Remove team member from external organizations:
    • GitHub
    • Weights and Biases
    • HuggingFace
    • the lab code of conduct (see code of conduct)
    • the Github account (here)
    • lab infrastructure (such as Cytomine, VMs, etc: infrastructure)
  7. Remove team member accounts from lab-specific infrastructure:
    • Cytomine
    • Hilde workstation
  8. Remove team member from Microsoft team and mailing list

Version changes

Unreleased

2025-01-27

2025-01-07