The issue of table and index bloat caused by failed inserts on unique constraints is well known and has been discussed in various articles across the internet. However, these discussions sometimes lack a clear, practical example with measurements to illustrate the impact. And despite the familiarity of this issue, we still frequently see this design pattern (or rather, anti-pattern) in real-world applications. Developers often rely on unique constraints to prevent duplicate values from being inserted into tables. While this approach is straightforward, versatile, and generally considered effective, in PostgreSQL every insert that fails due to a unique constraint violation leaves dead tuples behind, leading to table and index bloat. On high-traffic systems, this unnecessary bloat can significantly increase disk I/O and the frequency of autovacuum runs. In this article, we aim to highlight this problem once again and provide a straightforward example with measurements to illustrate it. We also suggest a simple improvement that can help mitigate this issue and reduce autovacuum workload and disk I/O.
Two Approaches to Duplicate Prevention
In PostgreSQL, there are two main ways to prevent duplicate values using unique constraints:
1. Standard Insert Command (INSERT INTO table)
The usual INSERT INTO table command attempts to insert data directly into the table. If the insert would result in a duplicate value, it fails with a “duplicate key value violates unique constraint” error. Since the command does not specify any duplicate checks, PostgreSQL internally immediately inserts the new row and only then begins updating indexes. When it encounters a unique index violation, it triggers the error and deletes the newly added row. The order of index updates is determined by their relation IDs, so the extent of index bloat depends on the order in which indexes were created. With repeated “unique constraint violation” errors, both the table and some indexes accumulate deleted records leading to bloat, and the resulting write operations increase disk I/O without achieving any useful outcome.
2. Conflict-Aware Insert (INSERT INTO table … ON CONFLICT DO NOTHING)
The INSERT INTO table ON CONFLICT DO NOTHING command behaves differently. Since it specifies that a conflict might occur, PostgreSQL first checks for potential duplicates before attempting to insert data. If a duplicate is found, PostgreSQL performs the specified action—in this case, “DO NOTHING”—and no error occurs. This clause was introduced in PostgreSQL 9.5, but some applications either still run on older PostgreSQL versions or retain legacy code when the database is upgraded. As a result, this conflict-handling option is often underutilized.
Testing Example
To run the test, we must start PostgreSQL with “autovacuum=off”. Otherwise, on a mostly idle instance, autovacuum would process the bloated objects immediately and we would be unable to capture the statistics. We create a simple test table with multiple indexes:
CREATE TABLE IF NOT EXISTS test_unique_constraints (
    id serial primary key,
    unique_text_key text,
    unique_integer_key integer,
    some_other_bigint_column bigint,
    some_other_text_column text
);
CREATE INDEX test_unique_constraints_some_other_bigint_column_idx
    ON test_unique_constraints (some_other_bigint_column);
CREATE INDEX test_unique_constraints_some_other_text_column_idx
    ON test_unique_constraints (some_other_text_column);
CREATE INDEX test_unique_constraints_unique_text_key_unique_integer_key__idx
    ON test_unique_constraints (unique_text_key, unique_integer_key, some_other_bigint_column);
CREATE UNIQUE INDEX test_unique_constraints_unique_integer_key_idx
    ON test_unique_constraints (unique_text_key);
CREATE UNIQUE INDEX test_unique_constraints_unique_text_key_idx
    ON test_unique_constraints (unique_integer_key);
And now we populate this table with unique data:
DO $$
BEGIN
    FOR i IN 1..1000 LOOP
        INSERT INTO test_unique_constraints
            (unique_text_key, unique_integer_key, some_other_bigint_column, some_other_text_column)
        VALUES (i::text, i, i, i::text);
    END LOOP;
END;
$$;
In the second step, we use a simple Python script to connect to the database, attempt to insert conflicting data, and close the session after an error. First, it sends 10,000 INSERT statements that conflict with the “test_unique_constraints_unique_integer_key_idx” index, then another 10,000 INSERTs conflicting with “test_unique_constraints_unique_text_key_idx”. The entire test completes in a few dozen seconds, after which we inspect all objects using the “pgstattuple” extension. The following query lists all objects in a single output:
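The test script itself is not reproduced here. As a rough illustration only, a script of this kind might look like the sketch below. It assumes the psycopg2 driver and a caller-supplied connection string; the helper names build_insert and run_conflicting_inserts and the inserted values are hypothetical and not taken from the original test.

```python
# Sketch only: the driver (psycopg2), the DSN, and the helper names are
# assumptions, not the script used for the measurements in this article.

COLUMNS = (
    "unique_text_key",
    "unique_integer_key",
    "some_other_bigint_column",
    "some_other_text_column",
)


def build_insert(on_conflict_do_nothing: bool = False) -> str:
    """Return a parameterized INSERT for test_unique_constraints."""
    sql = (
        "INSERT INTO test_unique_constraints ("
        + ", ".join(COLUMNS)
        + ") VALUES (%s, %s, %s, %s)"
    )
    if on_conflict_do_nothing:
        sql += " ON CONFLICT DO NOTHING"
    return sql


def run_conflicting_inserts(dsn: str, attempts: int,
                            on_conflict_do_nothing: bool = False) -> int:
    """Open one session per attempt, try a single conflicting insert,
    and return how many unique-violation errors were raised."""
    import psycopg2  # imported lazily so build_insert works without the driver
    from psycopg2 import errors

    sql = build_insert(on_conflict_do_nothing)
    failures = 0
    for n in range(attempts):
        conn = psycopg2.connect(dsn)
        try:
            with conn.cursor() as cur:
                # unique_integer_key = 1 already exists after the initial load;
                # the other values are new, so only that column conflicts.
                cur.execute(sql, (f"new-{n}", 1, 100000 + n, f"text-{n}"))
            conn.commit()
        except errors.UniqueViolation:
            failures += 1
            conn.rollback()
        finally:
            conn.close()  # close the session after every attempt
    return failures
```

Calling run_conflicting_inserts with a DSN such as "dbname=test" (a placeholder) and attempts=10000 would correspond to one half of the test described above; passing on_conflict_do_nothing=True switches to the conflict-aware form of the statement.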
WITH maintable AS (
    SELECT oid, relname
    FROM pg_class
    WHERE relname = 'test_unique_constraints'
)
SELECT m.oid AS relid, m.relname AS relation, s.*
FROM maintable m
JOIN LATERAL (SELECT * FROM pgstattuple(m.oid)) s ON true
UNION ALL
SELECT i.indexrelid AS relid, indexrelid::regclass::text AS relation, s.*
FROM pg_index i
JOIN LATERAL (SELECT * FROM pgstattuple(i.indexrelid)) s ON true
WHERE i.indrelid::regclass::text = 'test_unique_constraints'
ORDER BY relid;
Observed Results
After running the whole test several times, we observe the following:
- The main table “test_unique_constraints” always has 1,000 live tuples and 20,000 additional dead tuples, so roughly 85% of the table consists of dead tuples.
- The primary key index always shows 21,000 tuples, unaware that 20,000 of these records are marked as deleted in the main table.
- The other non-unique indexes show different results in different runs, ranging between 3,000 and 21,000 records. The numbers depend on the distribution of values the script generates for the underlying columns. We tested both repeated and completely unique values: repeated values resulted in fewer records in the indexes, while completely unique values led to the full count of 21,000 records.
- The unique indexes repeatedly showed tuple counts of only 1,000 to 1,400 in all tests. The unique index on “unique_text_key” always shows some dead tuples in the output. A precise explanation of these numbers would require deeper inspection of these relations and of the pgstattuple code, which is beyond the scope of this article. Still, some small bloat is reported here as well.
- The numbers reported by the pgstattuple function raised questions about their accuracy, although the documentation suggests they should be precise at the tuple level.
- A subsequent manual vacuum confirms 20,000 dead records in the main table and 54 pages removed from the primary key index, plus up to several dozen pages removed from the other indexes; the numbers differ in each run, depending on the total tuple count in these relations as described above.
- Each failed insert also increments the Transaction ID and thus increases the database’s transaction age.
Here is one example output from the query shown above after the test run which used unique values for all columns. As we can see, bloat of non unique indexes due to failed inserts can be big.
 relid |                            relation                             | table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-------+-----------------------------------------------------------------+-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 16418 | test_unique_constraints                                         |   1269760 |        1000 |     51893 |          4.09 |            20000 |        1080000 |              85.06 |       5420 |         0.43
 16424 | test_unique_constraints_pkey                                    |    491520 |       21000 |    336000 |         68.36 |                0 |              0 |                  0 |      51444 |        10.47
 16426 | test_unique_constraints_some_other_bigint_column_idx            |    581632 |       16396 |    326536 |         56.14 |                0 |              0 |                  0 |     168732 |        29.01
 16427 | test_unique_constraints_some_other_text_column_idx              |    516096 |       16815 |    327176 |         63.39 |                0 |              0 |                  0 |     101392 |        19.65
 16428 | test_unique_constraints_unique_text_key_unique_integer_key__idx |   1015808 |       21000 |    584088 |          57.5 |                0 |              0 |                  0 |     323548 |        31.85
 16429 | test_unique_constraints_unique_text_key_idx                     |     57344 |        1263 |     20208 |         35.24 |                2 |             32 |               0.06 |      15360 |        26.79
 16430 | test_unique_constraints_unique_integer_key_idx                  |     40960 |        1000 |     16000 |         39.06 |                0 |              0 |                  0 |       4404 |        10.75
(7 rows)
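To make the main-table row easier to read, the short sketch below recomputes its percentage columns from the raw byte counts. The input values are copied from that row; the 8192-byte page size is an assumption (the PostgreSQL default block size) and is used only to express the table length in pages.

```python
# Values copied from the main-table row of the pgstattuple output above.
table_len = 1_269_760
tuple_len = 51_893
dead_tuple_count = 20_000
dead_tuple_len = 1_080_000

PAGE_SIZE = 8192  # assumption: default PostgreSQL block size

# Both percentages are byte lengths relative to the total relation length.
tuple_percent = round(100 * tuple_len / table_len, 2)
dead_tuple_percent = round(100 * dead_tuple_len / table_len, 2)

pages = table_len // PAGE_SIZE
bytes_per_dead_tuple = dead_tuple_len // dead_tuple_count

print(f"live: {tuple_percent}%  dead: {dead_tuple_percent}%")
print(f"pages: {pages}  bytes per dead tuple: {bytes_per_dead_tuple}")
```

The results match the reported 4.09% and 85.06%. Under the page-size assumption the table occupies 155 pages, and each dead tuple accounts for 54 bytes, space that was written only to hold rows that never became visible.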
In a second test, we modify the script to include the ON CONFLICT DO NOTHING clause in the INSERT command and repeat both tests. This time, the inserts do not result in errors; instead, they simply return “INSERT 0 0”, indicating that no records were inserted. Inspecting the Transaction ID after this test shows only a minimal increase, caused by background processes. The attempts to insert conflicting data did not increase the Transaction ID (XID): PostgreSQL first started only a virtual transaction to check for conflicts, and because a conflict was found, it ended the transaction without ever assigning a new XID. The “pgstattuple” output confirms that all objects contain only live data, with no dead tuples this time.
Summary
As demonstrated, each failed insert bloats the underlying table and some indexes, and increases the Transaction ID because each failed insert occurs in a separate transaction. Consequently, autovacuum is forced to run more frequently, consuming valuable system resources. Therefore applications still relying solely on plain INSERT commands without ON CONFLICT conditions should consider reviewing this implementation. But as always, the final decision should be based on the specific conditions of each application.
Artificial Intelligence (AI) is often regarded as a groundbreaking innovation of the modern era, yet its roots extend much further back than many realize. In 1943, neuroscientist Warren McCulloch and logician Walter Pitts proposed the first computational model of a neuron. The term “Artificial Intelligence” was coined in 1956. The subsequent creation of the Perceptron in 1957, the first model of a neural network, and the expert system Dendral designed for chemical analysis demonstrated the potential of computers to process data and apply expert knowledge in specific domains. From the 1970s to the 1990s, expert systems proliferated. A pivotal moment for AI in the public eye came in 1997 when IBM’s chess-playing computer Deep Blue defeated chess world champion Garry Kasparov.
The new millennium brought a new era for AI, with the integration of rudimentary AI systems into everyday technology. Spam filters, recommendation systems, and search engines subtly shaped online user experiences. In 2006, deep learning emerged, marking the renaissance of neural networks. The landmark development came in 2017 with the introduction of Transformers, a neural network architecture that became the most important ingredient for the creation of Large Language Models (LLMs). Its key component, the attention mechanism, enables the model to discern relationships between words over long distances within a text. This mechanism assigns varying weights to words depending on their contextual importance, acknowledging that the same word can hold different meanings in different situations. However, modern AI, as we know it, was made possible mainly thanks to the availability of large datasets and powerful computational hardware. Without the vast resources of the internet and electronic libraries worldwide, modern AI would not have enough data to learn and evolve. And without modern performant GPUs, training AI would be a challenging task.
The LLM is a sophisticated, multilayer neural network comprising numerous interconnected nodes. These nodes are the micro-decision-makers that underpin the collective intelligence of the system. During its training phase, an LLM learns to balance myriad small, simple decisions, which, when combined, enable it to handle complex tasks. The intricacies of these internal decisions are typically opaque to us, as we are primarily interested in the model’s output. However, these complex neural networks can only process numbers, not raw text. Text must be tokenized into words or sub-words, standardized, and normalized — converted to lowercase, stripped of punctuation, etc. These tokens are then put into a dictionary and mapped to unique numerical values. Only this numerical representation of the text allows LLMs to learn the complex relationships between words, phrases, and concepts and the likelihood of certain words or phrases following one another. LLMs therefore process texts as huge numerical arrays without truly understanding the content. They lack a mental model of the world and operate solely on mathematical representations of word relationships and their probabilities. This focus on the answer with the highest probability is also the reason why LLMs can “hallucinate” plausible yet incorrect information or get stuck in response loops, regurgitating the same or similar answers repeatedly.
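The text-to-numbers step described above can be illustrated with a toy sketch. It follows the simplified description in this article (lowercasing, stripping punctuation, word-level tokens, a dictionary of IDs); the function names are hypothetical, and real tokenizers are considerably more elaborate, typically working on sub-word units.

```python
import string


def tokenize(text: str) -> list[str]:
    """Lowercase the text, strip ASCII punctuation, split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def build_vocabulary(tokens: list[str]) -> dict[str, int]:
    """Map each distinct token to a unique integer, in order of appearance."""
    vocabulary: dict[str, int] = {}
    for token in tokens:
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(text: str, vocabulary: dict[str, int]) -> list[int]:
    """Convert text into the list of token IDs the model would see."""
    return [vocabulary[token] for token in tokenize(text)]


sample = "The apple fell. The Apple rolled!"
vocabulary = build_vocabulary(tokenize(sample))
print(vocabulary)
print(encode(sample, vocabulary))
```

After this step the model sees only the list of integers; whether "apple" means a fruit or a company has to be inferred from the numbers around it.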
Based on the relationships between words learned from texts, LLMs also create vast webs of semantic associations that interconnect words. These associations form the backbone of an LLM’s ability to generate contextually appropriate and meaningful responses. When we provide a prompt to an LLM, we are not merely supplying words; we are activating a complex network of related concepts and ideas. Consider the word “apple”. This simple term can trigger a cascade of associated concepts such as “fruit,” “tree,” “food,” and even “technology” or “computer”. The activated associations depend on the context provided by the prompt and the prevalence of related concepts in the training data. The specificity of a prompt greatly affects the semantic associations an LLM considers. A vague prompt like “tell me about apples” may activate a wide array of diverse associations, ranging from horticultural information about apple trees to the nutritional value of the fruit or even cultural references like the tale of Snow White. An LLM will typically use the association with the highest occurrence in its training data when faced with such a broad prompt. For more targeted and relevant responses, it is crucial to craft focused prompts that incorporate specific technical jargon or references to particular disciplines. By doing so, the user can guide the LLM to activate a more precise subset of semantic associations, thereby narrowing the scope of the response to the desired area of expertise or inquiry.
LLMs have internal parameters that influence their creativity and determinism, such as “temperature”, “top-p”, “max length”, and various penalties. However, these are typically set to balanced defaults, and users should not modify them; otherwise, they could compromise the ability of LLMs to provide meaningful answers. Prompt engineering is therefore the primary method for guiding LLMs toward desired outputs. By crafting specific prompts, users can subtly direct the model’s responses, ensuring relevance and accuracy. The LLM derives a wealth of information from the prompt, determining not only semantic associations for the answer but also estimating its own role and the target audience’s knowledge level. By default, an LLM assumes the role of a helper and assistant, but it can adopt an expert’s voice if prompted. However, to elicit an expert-level response, one must not only set an expert role for the LLM but also specify that the inquirer is an expert as well. Otherwise, an LLM assumes an “average Joe” as the target audience by default. Therefore, even when asked to impersonate an expert role, an LLM may decide to simplify the language for the “average Joe” if the knowledge level of the target audience is not specified, which can result in a disappointing answer.
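To give a feel for what the “temperature” parameter mentioned above does, the sketch below applies temperature scaling to a softmax over three made-up scores. The scores and the function name are purely illustrative; this shows the general technique, not the internals of any particular model.

```python
import math


def softmax_with_temperature(scores: list[float], temperature: float = 1.0) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical scores for three candidate next tokens.
scores = [2.0, 1.0, 0.1]
for temperature in (0.5, 1.0, 2.0):
    probabilities = softmax_with_temperature(scores, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

At a low temperature almost all probability mass lands on the top-scoring candidate, which makes the output more deterministic; at a high temperature the distribution flattens and less likely candidates are sampled more often.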
Consider two prompts for addressing a technical issue with PostgreSQL:
1. “Hi, what could cause delayed checkpoints in PostgreSQL?”
2. “Hi, we are both leading PostgreSQL experts investigating delayed checkpoints. The logs show checkpoints occasionally taking 3-5 times longer than expected. Let us analyze this step by step and identify probable causes.”
The depth of the responses will vary significantly, illustrating the importance of prompt specificity. The second prompt employs common prompting techniques, which we will explore in the following paragraphs. However, it is crucial to recognize the limitations of LLMs, particularly when dealing with expert-level knowledge, such as the issue of delayed checkpoints in our example. Depending on the AI model and the quality of its training data, users may receive either helpful or misleading answers. The quality and amount of training data representing the specific topic play a crucial role.
Highly specialized problems may be underrepresented in the training data, leading to overfitting or hallucinated responses. Overfitting occurs when an LLM focuses too closely on its training data and fails to generalize, providing answers that seem accurate but are contextually incorrect. In our PostgreSQL example, a hallucinated response might borrow facts from other databases (like MySQL or MS SQL) and adjust them to fit PostgreSQL terminology. Thus, the prompt itself is no guarantee of a high-quality answer—any AI-generated information must be carefully verified, which is a task that can be challenging for non-expert users.
With these limitations in mind, let us now delve deeper into prompting techniques. “Zero-shot prompting” is a baseline approach where the LLM operates without additional context or supplemental reference material, relying on its pre-trained knowledge and the prompt’s construction. By carefully activating the right semantic associations and setting the correct scope of attention, the output can be significantly improved. However, LLMs, much like humans, can benefit from examples. By providing reference material within the prompt, the model can learn patterns and structure its output accordingly. This technique is called “few-shot prompting”. The quality of the output is directly related to the quality and relevance of the reference material; hence, the adage “garbage in, garbage out” always applies.
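The difference between the two techniques is easiest to see in how the prompt text is assembled. The sketch below is one possible way to do that; the function name, the Q/A layout, and the example questions are assumptions chosen for illustration, not a required format.

```python
def build_few_shot_prompt(instruction: str,
                          examples: list[tuple[str, str]],
                          question: str) -> str:
    """Assemble a prompt from an instruction, worked examples, and a question.

    With an empty examples list this degenerates to a zero-shot prompt.
    """
    parts = [instruction.strip(), ""]
    for example_question, example_answer in examples:
        parts.append(f"Q: {example_question}")
        parts.append(f"A: {example_answer}")
        parts.append("")
    parts.append(f"Q: {question}")
    parts.append("A:")
    return "\n".join(parts)


examples = [
    ("Which parameter sets the maximum time between automatic checkpoints?",
     "checkpoint_timeout"),
    ("Which parameter sets the WAL size that triggers a checkpoint?",
     "max_wal_size"),
]
prompt = build_few_shot_prompt(
    "Answer with the name of a single PostgreSQL configuration parameter.",
    examples,
    "Which parameter spreads checkpoint writes over time?",
)
print(prompt)
```

The two worked examples show the model the expected answer format (a bare parameter name), which is often easier than describing that format in words.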
For complex issues, “chain-of-thought” prompting can be particularly effective. This technique can significantly improve the quality of complicated answers because LLMs can struggle with long-distance dependencies in reasoning. Chain-of-thought prompting addresses this by instructing the model to break down the reasoning process into smaller, more manageable parts. It leads to more structured and comprehensible answers by focusing on better-defined sub-problems. In our PostgreSQL example prompt, the phrase “Let us analyze this step by step” instructs the LLM to divide the processing into a chain of smaller sub-problems. An evolution of this technique is the “tree of thoughts” technique. Here, the model not only breaks down the reasoning into parts but also creates a tree structure with parallel paths of reasoning. Each path is processed separately, allowing the model to converge on the most promising solution. This approach is particularly useful for complex problems requiring creative brainstorming. In our PostgreSQL example prompt, the phrase “identify probable causes” instructs the LLM to discuss several possible pathways in the answer.
Of course, prompting techniques have their drawbacks. Few-shot prompting is limited by the number of tokens, which restricts the amount of information that can be included. Additionally, the model may ignore parts of excessively long prompts, especially the middle sections. Care must also be taken with the frequency of certain words in the reference material, as overlooked frequency can bias the model’s output. Chain-of-thought prompting can also lead to overfitted or “hallucinated” responses for some sub-problems, compromising the overall result.
Instructing the model to provide deterministic, factual responses is another prompting technique, vital for scientific and technical topics. Formulations like “answer using only reliable sources and cite those sources” or “provide an answer based on peer-reviewed scientific literature and cite the specific studies or articles you reference” can direct the model to base its responses on trustworthy sources. However, as already discussed, even with instructions to focus on factual information, the AI’s output must be verified to avoid falling into the trap of overfitted or hallucinated answers.
In conclusion, effective prompt engineering is a skill that combines creativity with strategic thinking, guiding the AI to deliver the most useful and accurate responses. Whether we are seeking simple explanations or delving into complex technical issues, the way we communicate with the AI always makes a difference in the quality of the response. However, we must always keep in mind that even the best prompt is no guarantee of a quality answer, and we must double-check received facts. The quality and amount of training data are paramount, and this means that some problems with received answers can persist even in future LLMs simply because they would have to use the same limited data for some specific topics.
When the model’s training data is sparse or ambiguous in certain highly focused areas, it can produce responses that are syntactically valid but factually incorrect. One reason AI hallucinations can be particularly problematic is their inherent plausibility. The generated text is usually grammatically correct and stylistically consistent, making it difficult for users to immediately identify inaccuracies without external verification. This highlights a key distinction between plausibility and veracity: just because something sounds right does not mean it is true.
Whether the response is an insightful solution to a complex problem or completely fabricated nonsense is a distinction that human users must make, based on their expertise in the topic at hand. Our clients have repeatedly had exactly this experience with different LLMs: they tried to solve their technical problems using AI, but the answers were partially incorrect or did not work at all. This is why human expert knowledge is still the most important factor when it comes to solving difficult technical issues. The inherent limitations of LLMs are unlikely to be fully overcome, at least not with current algorithms. Therefore, expert knowledge will remain essential in delivering reliable, high-quality solutions even in the future. As people increasingly use AI tools in the same way they rely on Google, as a resource or assistant, true expertise will still be needed to interpret, refine, and implement these tools effectively. On the other hand, AI is emerging as a key driver of innovation. Progressive companies are investing heavily in AI, facing challenges related to security and performance. This is an area where NetApp can also help: its cloud AI-focused solutions are designed to address exactly these issues.
(Picture generated by my colleague Felix Alipaz-Dicke using ChatGPT-4.)
Mastering Cloud Infrastructure with Pulumi: Introduction
In today’s rapidly changing landscape of cloud computing, managing infrastructure as code (IaC) has become essential for developers and IT professionals. Pulumi, an open-source IaC tool, brings a fresh perspective to the table by enabling infrastructure management using popular programming languages like JavaScript, TypeScript, Python, Go, and C#. This approach offers a unique blend of flexibility and power, allowing developers to leverage their existing coding skills to build, deploy, and manage cloud infrastructure. In this post, we’ll explore the world of Pulumi and see how it pairs with Amazon FSx for NetApp ONTAP—a robust solution for scalable and efficient cloud storage.
Pulumi – The Theory
Why Pulumi?
Pulumi distinguishes itself among IaC tools for several compelling reasons:
- Use Familiar Programming Languages: Unlike traditional IaC tools that rely on domain-specific languages (DSLs), Pulumi allows you to use familiar programming languages. This means no need to learn new syntax, and you can incorporate sophisticated logic, conditionals, and loops directly in your infrastructure code.
- Seamless Integration with Development Workflows: Pulumi integrates effortlessly with existing development workflows and tools, making it a natural fit for modern software projects. Whether you’re managing a simple web app or a complex, multi-cloud architecture, Pulumi provides the flexibility to scale without sacrificing ease of use.
Challenges with Pulumi
Like any tool, Pulumi comes with its own set of challenges:
- Learning Curve: While Pulumi leverages general-purpose languages, developers need to be proficient in the language they choose, such as Python or TypeScript. This can be a hurdle for those unfamiliar with these languages.
- Growing Ecosystem: As a relatively new tool, Pulumi’s ecosystem is still expanding. It might not yet match the extensive plugin libraries of older IaC tools, but its vibrant and rapidly growing community is a promising sign of things to come.
State Management in Pulumi: Ensuring Consistency Across Deployments
Effective infrastructure management hinges on proper state handling. Pulumi excels in this area by tracking the state of your infrastructure, enabling it to manage resources efficiently. This capability ensures that Pulumi knows exactly what needs to be created, updated, or deleted during deployments. Pulumi offers several options for state storage:
- Local State: Stored directly on your local file system. This option is ideal for individual projects or simple setups.
- Remote State: By default, Pulumi stores state remotely on the Pulumi Service (a cloud-hosted platform provided by Pulumi), but it also allows you to configure storage on AWS S3, Azure Blob Storage, or Google Cloud Storage. This is particularly useful in team environments where collaboration is essential.
Managing state effectively is crucial for maintaining consistency across deployments, especially in scenarios where multiple team members are working on the same infrastructure.
Other IaC Tools: Comparing Pulumi to Traditional IaC Tools
When comparing Pulumi to other Infrastructure as Code (IaC) tools, several drawbacks of traditional approaches become evident:
- Domain-Specific Language (DSL) Limitations: Many IaC tools depend on DSLs, such as Terraform’s HCL, requiring users to learn a specialized language specific to the tool.
- YAML/JSON Constraints: Tools that rely on YAML or JSON can be both restrictive and verbose, complicating the management of more complex configurations.
- Steep Learning Curve: The necessity to master DSLs or particular configuration formats adds to the learning curve, especially for newcomers to IaC.
- Limited Logical Capabilities: DSLs often lack support for advanced logic constructs such as loops, conditionals, and reusability. This limitation can lead to repetitive code that is challenging to maintain.
- Narrow Ecosystem: Some IaC tools have a smaller ecosystem, offering fewer plugins, modules, and community-driven resources.
- Challenges with Code Reusability: The inability to reuse code across different projects or components can hinder efficiency and scalability in infrastructure management.
- Testing Complexity: Testing infrastructure configurations written in DSLs can be challenging, making it difficult to ensure the reliability and robustness of the infrastructure code.
Pulumi – In Practice
Introduction
In this section, we’ll dive into a practical example to better understand Pulumi’s capabilities. We’ll also explore how to set up a project using Pulumi with AWS and automate it using GitHub Actions for CI/CD.
Prerequisites
Before diving into using Pulumi with AWS and automating your infrastructure management through GitHub Actions, ensure you have the following prerequisites in place:
- Pulumi CLI: Begin by installing the Pulumi CLI by following the official installation instructions. After installation, verify that Pulumi is correctly set up and accessible in your system’s PATH by running a quick version check.
- AWS CLI: Install the AWS CLI, which is essential for interacting with AWS services. Configure the AWS CLI with your AWS credentials to ensure you have access to the necessary AWS resources. Ensure your AWS account is equipped with the required permissions, especially for IAM, EC2, S3, and any other AWS services you plan to manage with Pulumi.
- AWS IAM User/Role for GitHub Actions: Create a dedicated IAM user or role in AWS specifically for use in your GitHub Actions workflows. This user or role should have permissions necessary to manage the resources in your Pulumi stack. Store the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY securely as secrets in your GitHub repository.
- Pulumi Account: Set up a Pulumi account if you haven’t already. Generate a Pulumi access token and store it as a secret in your GitHub repository to facilitate secure automation.
- Python and Pip: Install Python (version 3.7 or higher is recommended) along with Pip, which are necessary for Pulumi’s Python SDK. Once Python is installed, proceed to install Pulumi’s Python SDK along with any required AWS packages to enable infrastructure management through Python.
- GitHub Account: Ensure you have an active GitHub account to host your code and manage your repository. Create a GitHub repository where you’ll store your Pulumi project and related automation workflows. Store critical secrets like AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and your Pulumi access token securely in the GitHub repository’s secrets section.
- GitHub Runners: Utilize GitHub-hosted runners to execute your GitHub Actions workflows, or set up self-hosted runners if your project requires them. Confirm that the runners have all necessary tools installed, including Pulumi, AWS CLI, Python, and any other dependencies your Pulumi project might need.
Project Structure
When working with Infrastructure as Code (IaC) using Pulumi, maintaining an organized project structure is essential. A clear and well-defined directory structure not only streamlines the development process but also improves collaboration and deployment efficiency. In this post, we’ll explore a typical directory structure for a Pulumi project and explain the significance of each component.
Overview of a Typical Pulumi Project Directory
A standard Pulumi project might be organized as follows:
/project-root
├── .github
│   └── workflows
│       └── workflow.yml    # GitHub Actions workflow for CI/CD
├── __main__.py             # Entry point for the Pulumi program
├── infra.py                # Infrastructure code
├── Pulumi.dev.yaml         # Pulumi configuration for the development environment
├── Pulumi.prod.yaml        # Pulumi configuration for the production environment
├── Pulumi.yaml             # Pulumi configuration (common or default settings)
├── requirements.txt        # Python dependencies
└── test_infra.py           # Tests for infrastructure code
NetApp FSx on AWS
Introduction
Amazon FSx for NetApp ONTAP offers a fully managed, scalable storage solution built on the NetApp ONTAP file system. It provides high-performance, highly available shared storage that seamlessly integrates with your AWS environment. Leveraging the advanced data management capabilities of ONTAP, FSx for NetApp ONTAP is ideal for applications needing robust storage features and compatibility with existing NetApp systems.
Key Features
- High Performance: FSx for ONTAP delivers low-latency storage designed to handle demanding, high-throughput workloads.
- Scalability: Capable of scaling to support petabytes of storage, making it suitable for both small and large-scale applications.
- Advanced Data Management: Leverages ONTAP’s comprehensive data management features, including snapshots, cloning, and disaster recovery.
- Multi-Protocol Access: Supports NFS and SMB protocols, providing flexible access options for a variety of clients.
- Cost-Effectiveness: Implements tiering policies to automatically move less frequently accessed data to lower-cost storage, helping optimize storage expenses.
What It’s About
In the next sections, we’ll walk through the specifics of setting up each component using Pulumi code, illustrating how to create a VPC, configure subnets, set up a security group, and deploy an FSx for NetApp ONTAP file system, all while leveraging the robust features provided by both Pulumi and AWS.
Architecture Overview
Figure: Single-AZ deployment with FSx and EC2, the architecture we’ll deploy using Pulumi.
The diagram above illustrates the architecture for deploying an FSx for NetApp ONTAP file system within a single Availability Zone. The setup includes a VPC with public and private subnets, an Internet Gateway for outbound traffic, and a Security Group controlling access to the FSx file system and the EC2 instance. The EC2 instance is configured to mount the FSx volume using NFS, enabling seamless access to storage.
Setting up Pulumi
Follow these steps to set up Pulumi and integrate it with AWS:
Install Pulumi: Begin by installing Pulumi using the following command:
curl -fsSL https://get.pulumi.com | sh
Install AWS CLI: If you haven’t installed it yet, install the AWS CLI to manage AWS services:
pip install awscli
Configure AWS CLI: Configure the AWS CLI with your credentials:
aws configure
Create a New Pulumi Project: Initialize a new Pulumi project with AWS and Python:
pulumi new aws-python
Configure Your Pulumi Stack: Set the AWS region for your Pulumi stack:
pulumi config set aws:region eu-central-1
Deploy Your Stack: Deploy your infrastructure using Pulumi:
pulumi preview ; pulumi up
Example: VPC, Subnets, and FSx for NetApp ONTAP
Let’s dive into an example Pulumi project that sets up a Virtual Private Cloud (VPC), subnets, a security group, an Amazon FSx for NetApp ONTAP file system, and an EC2 instance.
Pulumi Code Example: VPC, Subnets, and FSx for NetApp ONTAP
The first step is to define the parameters required to set up the infrastructure. They live in the stack configuration file Pulumi.dev.yaml, which specifies the AWS region, the Availability Zone, the name of the SSH key pair, and the CIDR blocks for the two subnets. These settings are used to configure and deploy the cloud infrastructure resources in the specified AWS region.
config:
  aws:region: eu-central-1
  demo:availabilityZone: eu-central-1a
  demo:keyName: XYZ
  demo:subnet1CIDER: 10.0.3.0/24
  demo:subnet2CIDER: 10.0.4.0/24
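Before deploying, it can help to sanity-check the CIDR blocks from the configuration. The following is a minimal sketch, not part of the original project: the helper name `validate_subnets` is hypothetical, and the VPC range is assumed to be the `10.0.0.0/16` block that infra.py hard-codes. It only uses Python's standard `ipaddress` module.

```python
import ipaddress
from typing import List

# Assumption: this matches the cidr_block hard-coded for the VPC in infra.py.
VPC_CIDR = "10.0.0.0/16"


def validate_subnets(vpc_cidr: str, subnet_cidrs: List[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []

    # Every subnet must sit inside the VPC address range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")

    # No two subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")

    return problems


if __name__ == "__main__":
    # Values taken from the demo:subnet1CIDER / demo:subnet2CIDER settings.
    print(validate_subnets(VPC_CIDR, ["10.0.3.0/24", "10.0.4.0/24"]))
```

A check like this could run as an ordinary pytest test, so a typo in a CIDR block is caught before Pulumi contacts AWS.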
The following code snippet should be placed in the infra.py file. It details the setup of the VPC, subnets, security group, and FSx for NetApp ONTAP file system. Each step in the code is explained through inline comments.
import os

import pulumi
import pulumi_aws as aws
import pulumi_command as command

# Retrieve configuration values from Pulumi configuration files
aws_config = pulumi.Config("aws")
region = aws_config.require("region")  # The AWS region where resources will be deployed

demo_config = pulumi.Config("demo")
availability_zone = demo_config.require("availabilityZone")  # Availability Zone for the deployment
subnet1_cidr = demo_config.require("subnet1CIDER")  # CIDR block for the public subnet
subnet2_cidr = demo_config.require("subnet2CIDER")  # CIDR block for the private subnet
key_name = demo_config.require("keyName")  # Name of the SSH key pair for EC2 instance access

# Create a new VPC with DNS support enabled
vpc = aws.ec2.Vpc(
    "fsxVpc",
    cidr_block="10.0.0.0/16",  # VPC CIDR block
    enable_dns_support=True,  # Enable DNS support in the VPC
    enable_dns_hostnames=True,  # Enable DNS hostnames in the VPC
)

# Create an Internet Gateway to allow internet access from the VPC
internet_gateway = aws.ec2.InternetGateway(
    "vpcInternetGateway",
    vpc_id=vpc.id,  # Attach the Internet Gateway to the VPC
)

# Create a public route table for routing internet traffic via the Internet Gateway
public_route_table = aws.ec2.RouteTable(
    "publicRouteTable",
    vpc_id=vpc.id,
    routes=[aws.ec2.RouteTableRouteArgs(
        cidr_block="0.0.0.0/0",  # Route all traffic (0.0.0.0/0) to the Internet Gateway
        gateway_id=internet_gateway.id,
    )],
)

# Create a single public subnet in the specified Availability Zone
public_subnet = aws.ec2.Subnet(
    "publicSubnet",
    vpc_id=vpc.id,
    cidr_block=subnet1_cidr,  # CIDR block for the public subnet
    availability_zone=availability_zone,  # The specified Availability Zone
    map_public_ip_on_launch=True,  # Assign public IPs to instances launched in this subnet
)

# Create a single private subnet in the same Availability Zone
private_subnet = aws.ec2.Subnet(
    "privateSubnet",
    vpc_id=vpc.id,
    cidr_block=subnet2_cidr,  # CIDR block for the private subnet
    availability_zone=availability_zone,  # The same Availability Zone
)

# Associate the public subnet with the public route table to enable internet access
public_route_table_association = aws.ec2.RouteTableAssociation(
    "publicRouteTableAssociation",
    subnet_id=public_subnet.id,
    route_table_id=public_route_table.id,
)

# Create a security group to control inbound and outbound traffic for the FSx file system
security_group = aws.ec2.SecurityGroup(
    "fsxSecurityGroup",
    vpc_id=vpc.id,
    description="Allow NFS traffic",  # Description of the security group
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=2049,  # NFS protocol port
            to_port=2049,
            cidr_blocks=["0.0.0.0/0"],  # Allow NFS traffic from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=111,  # RPCBind port for NFS
            to_port=111,
            cidr_blocks=["0.0.0.0/0"],  # Allow RPCBind traffic from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="udp",
            from_port=111,  # RPCBind port for NFS over UDP
            to_port=111,
            cidr_blocks=["0.0.0.0/0"],  # Allow RPCBind traffic over UDP from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=22,  # SSH port for EC2 instance access
            to_port=22,
            cidr_blocks=["0.0.0.0/0"],  # Allow SSH traffic from anywhere
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol="-1",  # Allow all outbound traffic
            from_port=0,
            to_port=0,
            cidr_blocks=["0.0.0.0/0"],  # Allow all outbound traffic to anywhere
        ),
    ],
)

# Create the FSx for NetApp ONTAP file system in the private subnet
file_system = aws.fsx.OntapFileSystem(
    "fsxFileSystem",
    subnet_ids=[private_subnet.id],  # Deploy the FSx file system in the private subnet
    preferred_subnet_id=private_subnet.id,  # Preferred subnet for the FSx file system
    security_group_ids=[security_group.id],  # Attach the security group to the FSx file system
    deployment_type="SINGLE_AZ_1",  # Single Availability Zone deployment
    throughput_capacity=128,  # Throughput capacity in MB/s
    storage_capacity=1024,  # Storage capacity in GB
)

# Create a Storage Virtual Machine (SVM) within the FSx file system
storage_virtual_machine = aws.fsx.OntapStorageVirtualMachine(
    "storageVirtualMachine",
    file_system_id=file_system.id,  # Associate the SVM with the FSx file system
    name="svm1",  # Name of the SVM
    root_volume_security_style="UNIX",  # Security style for the root volume
)

# Create a volume within the Storage Virtual Machine (SVM)
volume = aws.fsx.OntapVolume(
    "fsxVolume",
    storage_virtual_machine_id=storage_virtual_machine.id,  # Associate the volume with the SVM
    name="vol1",  # Name of the volume
    junction_path="/vol1",  # Junction path for mounting
    size_in_megabytes=10240,  # Size of the volume in MB
    storage_efficiency_enabled=True,  # Enable storage efficiency features
    tiering_policy=aws.fsx.OntapVolumeTieringPolicyArgs(
        name="SNAPSHOT_ONLY",  # Tiering policy for the volume
    ),
    security_style="UNIX",  # Security style for the volume
)

# Extract the DNS name from the list of SVM endpoints
dns_name = storage_virtual_machine.endpoints.apply(lambda e: e[0]['nfs'][0]['dns_name'])

# Get the latest Amazon Linux 2 AMI for the EC2 instance
ami = aws.ec2.get_ami(
    most_recent=True,
    owners=["amazon"],
    filters=[{"name": "name", "values": ["amzn2-ami-hvm-*-x86_64-gp2"]}],  # Filter for Amazon Linux 2 AMI
)

# Create an EC2 instance in the public subnet
ec2_instance = aws.ec2.Instance(
    "fsxEc2Instance",
    instance_type="t3.micro",  # Instance type for the EC2 instance
    vpc_security_group_ids=[security_group.id],  # Attach the security group to the EC2 instance
    subnet_id=public_subnet.id,  # Deploy the EC2 instance in the public subnet
    ami=ami.id,  # Use the latest Amazon Linux 2 AMI
    key_name=key_name,  # SSH key pair for accessing the EC2 instance
    tags={"Name": "FSx EC2 Instance"},  # Tag for the EC2 instance
)

# Shell script that installs the NFS client and mounts the FSx volume on the EC2 instance
user_data_script = dns_name.apply(lambda dns: f"""#!/bin/bash
sudo yum update -y
sudo yum install -y nfs-utils
sudo mkdir -p /mnt/fsx
if ! mountpoint -q /mnt/fsx; then
    sudo mount -t nfs {dns}:/vol1 /mnt/fsx
fi
""")

# Retrieve the private key for SSH access from the environment (set by GitHub Actions).
# The key is deliberately not printed: it would otherwise end up in the CI logs.
private_key_content = os.getenv("PRIVATE_KEY")

# Run the mount script on the EC2 instance over SSH. The explicit dependencies ensure
# that the FSx file system and the volume exist before the script is executed.
mount_fsx_file_system = command.remote.Command(
    "mountFsxFileSystem",
    connection=command.remote.ConnectionArgs(
        host=ec2_instance.public_ip,
        user="ec2-user",
        private_key=private_key_content,
    ),
    create=user_data_script,
    opts=pulumi.ResourceOptions(depends_on=[file_system, volume]),
)
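The mount script in infra.py is built inline inside a lambda, which makes it hard to test on its own. As a rough sketch, the same script could be rendered by a small pure function; the name `build_mount_script` and its default arguments are assumptions for illustration, and the host name in the demo is a placeholder rather than a real FSx endpoint.

```python
def build_mount_script(dns_name: str,
                       junction_path: str = "/vol1",
                       mount_point: str = "/mnt/fsx") -> str:
    """Render a shell script that mounts an NFS export only if it is not mounted yet."""
    lines = [
        "#!/bin/bash",
        "sudo yum update -y",
        "sudo yum install -y nfs-utils",
        f"sudo mkdir -p {mount_point}",
        # mountpoint -q exits non-zero when nothing is mounted there,
        # so re-running the script does not mount the volume twice.
        f"if ! mountpoint -q {mount_point}; then",
        f"    sudo mount -t nfs {dns_name}:{junction_path} {mount_point}",
        "fi",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder host name, used only to show the rendered script.
    print(build_mount_script("svm.example.internal"))
```

In infra.py, such a function could be passed to the existing output, for example `dns_name.apply(build_mount_script)`, while a plain unit test checks the rendered text without any Pulumi machinery.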
Pytest with Pulumi
The following tests go into the test_infra.py file. They use Pulumi's mocking support, so no real AWS resources are created.
# Importing necessary libraries
import pulumi
import pulumi_aws as aws
from typing import Any, Dict, List

# Setting up configuration values for AWS region and various parameters.
# The keys must match the ones that infra.py reads via Config.require().
pulumi.runtime.set_config('aws:region', 'eu-central-1')
pulumi.runtime.set_config('demo:availabilityZone', 'eu-central-1a')
pulumi.runtime.set_config('demo:subnet1CIDER', '10.0.3.0/24')
pulumi.runtime.set_config('demo:subnet2CIDER', '10.0.4.0/24')
pulumi.runtime.set_config('demo:keyName', 'XYZ')  # Change based on your own key


# Creating a class MyMocks to mock Pulumi's resources for testing
class MyMocks(pulumi.runtime.Mocks):
    def new_resource(self, args: pulumi.runtime.MockResourceArgs) -> List[Any]:
        # Initialize outputs with the resource's inputs
        outputs = args.inputs

        # Mocking specific resources based on their type
        if args.typ == "aws:ec2/instance:Instance":
            # Mocking an EC2 instance with some default values
            outputs = {
                **args.inputs,  # Start with the given inputs
                "ami": "ami-0eb1f3cdeeb8eed2a",  # Mock AMI ID
                "availability_zone": "eu-central-1a",  # Mock availability zone
                "publicIp": "203.0.113.12",  # Mock public IP
                "publicDns": "ec2-203-0-113-12.compute-1.amazonaws.com",  # Mock public DNS
                "user_data": "mock user data script",  # Mock user data
                "tags": {"Name": "test"},  # Mock tags
            }
        elif args.typ == "aws:ec2/securityGroup:SecurityGroup":
            # Mocking a Security Group with default ingress rules
            outputs = {
                **args.inputs,
                "ingress": [
                    {"from_port": 80, "cidr_blocks": ["0.0.0.0/0"]},  # Allow HTTP traffic from anywhere
                    {"from_port": 22, "cidr_blocks": ["192.168.0.0/16"]},  # Allow SSH traffic from a specific CIDR block
                ],
            }

        # Returning a mocked resource ID and the output values
        return [args.name + '_id', outputs]

    def call(self, args: pulumi.runtime.MockCallArgs) -> Dict[str, Any]:
        # Mocking a call to get an AMI
        if args.token == "aws:ec2/getAmi:getAmi":
            return {
                "architecture": "x86_64",  # Mock architecture
                "id": "ami-0eb1f3cdeeb8eed2a",  # Mock AMI ID
            }
        # Return an empty dictionary if no specific mock is needed
        return {}


# Setting the custom mocks for Pulumi
pulumi.runtime.set_mocks(MyMocks())

# Import the infrastructure to be tested (must happen after set_mocks)
import infra


# Define a test function to validate the AMI ID of the EC2 instance
@pulumi.runtime.test
def test_instance_ami():
    def check_ami(ami_id: str) -> None:
        print(f"AMI ID received: {ami_id}")
        # Assertion to ensure the AMI ID is the expected one
        assert ami_id == "ami-0eb1f3cdeeb8eed2a", 'EC2 instance must have the correct AMI ID'

    # Return the Output so that the test decorator waits for the assertion to run
    return infra.ec2_instance.ami.apply(check_ami)


# Define a test function to validate the availability zone of the EC2 instance
@pulumi.runtime.test
def test_instance_az():
    def check_az(availability_zone: str) -> None:
        print(f"Availability Zone received: {availability_zone}")
        # Assertion to ensure the instance is in the correct availability zone
        assert availability_zone == "eu-central-1a", 'EC2 instance must be in the correct availability zone'

    # Return the Output so that the availability zone check is awaited
    return infra.ec2_instance.availability_zone.apply(check_az)


# Define a test function to validate the tags of the EC2 instance
@pulumi.runtime.test
def test_instance_tags():
    def check_tags(tags: Dict[str, Any]) -> None:
        print(f"Tags received: {tags}")
        # Assertions to ensure the instance has tags and a 'Name' tag
        assert tags, 'EC2 instance must have tags'
        assert 'Name' in tags, 'EC2 instance must have a Name tag'

    # Return the Output so that the tag check is awaited
    return infra.ec2_instance.tags.apply(check_tags)


# Define a test function to validate the user data script of the EC2 instance
@pulumi.runtime.test
def test_instance_userdata():
    def check_user_data(user_data_script: str) -> None:
        print(f"User data received: {user_data_script}")
        # Assertion to ensure the instance has user data configured
        assert user_data_script is not None, 'EC2 instance must have user_data_script configured'

    # Return the Output so that the user data check is awaited
    return infra.ec2_instance.user_data.apply(check_user_data)
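The mocked security group above carries a list of ingress rules, which suggests one more kind of check: making sure a sensitive port is not open to the whole internet. The sketch below is a plain-Python helper working on dictionaries shaped like the mocked rules; the function name `find_open_ingress` is hypothetical, and the shape of real Pulumi outputs may differ from these simple dictionaries.

```python
from typing import Any, Dict, List

# CIDR blocks that mean "reachable from anywhere".
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_ingress(rules: List[Dict[str, Any]], port: int) -> List[Dict[str, Any]]:
    """Return the ingress rules that expose `port` to the whole internet."""
    exposed = []
    for rule in rules:
        from_port = rule.get("from_port", 0)
        # Rules without an explicit to_port are treated as single-port rules.
        to_port = rule.get("to_port", from_port)
        cidrs = set(rule.get("cidr_blocks", [])) | set(rule.get("ipv6_cidr_blocks", []))
        if from_port <= port <= to_port and cidrs & OPEN_CIDRS:
            exposed.append(rule)
    return exposed


if __name__ == "__main__":
    # Same rule shapes as in the mocked security group.
    mocked_rules = [
        {"from_port": 80, "cidr_blocks": ["0.0.0.0/0"]},
        {"from_port": 22, "cidr_blocks": ["192.168.0.0/16"]},
    ]
    print(find_open_ingress(mocked_rules, 22))
    print(find_open_ingress(mocked_rules, 80))
```

Applied to the rules actually declared in infra.py, a check like this would flag the SSH rule, since it allows port 22 from 0.0.0.0/0.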
GitHub Actions
Introduction
GitHub Actions is a powerful automation tool integrated within GitHub, enabling developers to automate their workflows, including testing, building, and deploying code. Pulumi, on the other hand, is an Infrastructure as Code (IaC) tool that allows you to manage cloud resources using familiar programming languages. In this post, we’ll explore why you should use GitHub Actions and its specific purpose when combined with Pulumi.
Why Use GitHub Actions and Its Importance
GitHub Actions is a powerful tool for automating workflows within your GitHub repository, offering several key benefits, especially when combined with Pulumi:
- Integrated CI/CD: GitHub Actions seamlessly integrates Continuous Integration and Continuous Deployment (CI/CD) directly into your GitHub repository. This automation enhances consistency in testing, building, and deploying code, reducing the risk of manual errors.
- Custom Workflows: It allows you to create custom workflows for different stages of your software development lifecycle, such as code linting, running unit tests, or managing complex deployment processes. This flexibility ensures your automation aligns with your specific needs.
- Event-Driven Automation: You can trigger GitHub Actions with events like pushes, pull requests, or issue creation. This event-driven approach ensures that tasks are automated precisely when needed, streamlining your workflow.
- Reusable Code: GitHub Actions supports reusable “actions” that can be shared across multiple workflows or repositories. This promotes code reuse and maintains consistency in automation processes.
- Built-in Marketplace: The GitHub Marketplace offers a wide range of pre-built actions from the community, making it easy to integrate third-party services or implement common tasks without writing custom code.
- Enhanced Collaboration: By using GitHub’s pull request and review workflows, teams can discuss and approve changes before deployment. This process reduces risks and improves collaboration on infrastructure changes.
- Automated Deployment: GitHub Actions automates the deployment of infrastructure code, using Pulumi to apply changes. This automation reduces the risk of manual errors and ensures a consistent deployment process.
- Testing: Running tests before deploying with GitHub Actions helps confirm that your infrastructure code works correctly, catching potential issues early and ensuring stability.
- Configuration Management: It manages and sets up necessary configurations for Pulumi and AWS, ensuring your environment is correctly configured for deployments.
- Preview and Apply Changes: GitHub Actions allows you to preview changes before applying them, helping you understand the impact of modifications and minimizing the risk of unintended changes.
- Cleanup: You can optionally destroy the stack after testing or deployment, helping control costs and maintain a clean environment.
Execution
To execute the GitHub Actions workflow:
- Placement: Save the workflow YAML file in your repository’s .github/workflows directory. This setup ensures that GitHub Actions will automatically detect and execute the workflow whenever there’s a push to the main branch of your repository.
- Workflow Actions: The workflow file performs several critical actions:
  - Environment Setup: Configures the necessary environment for running the workflow.
  - Dependency Installation: Installs the required dependencies, including Pulumi CLI and other Python packages.
  - Testing: Runs your tests to verify that your infrastructure code functions as expected.
  - Preview and Apply Changes: Uses Pulumi to preview and apply any changes to your infrastructure.
  - Cleanup: Optionally destroys the stack after tests or deployment to manage costs and maintain a clean environment.
By incorporating this workflow, you ensure that your Pulumi infrastructure is continuously integrated and deployed with proper validation, significantly improving the reliability and efficiency of your infrastructure management process.
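The order of these stages matters: tests come first, a preview precedes the update, and cleanup is optional. As a minimal sketch of that ordering, the stages could be modelled as a list of commands that is executed until the first failure. The function names `plan_pipeline` and `run_pipeline` are hypothetical helpers, not part of Pulumi or GitHub Actions; the command lines are the ones used in this article.

```python
import subprocess
from typing import Callable, List


def plan_pipeline(stack: str, destroy_after: bool = False) -> List[List[str]]:
    """Return the commands of the pipeline in the order they should run."""
    steps = [
        ["pytest"],                                    # Testing
        ["pulumi", "preview", "--stack", stack],       # Preview changes
        ["pulumi", "up", "--yes", "--stack", stack],   # Apply changes
    ]
    if destroy_after:
        # Optional cleanup to control costs.
        steps.append(["pulumi", "destroy", "--yes", "--stack", stack])
    return steps


def run_pipeline(steps: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each step in order; check=True aborts on the first failing command."""
    for argv in steps:
        runner(argv, check=True)


if __name__ == "__main__":
    # Only print the plan here; run_pipeline() would execute the commands for real.
    for argv in plan_pipeline("dev", destroy_after=True):
        print(" ".join(argv))
```

Because a failing step raises an exception, a broken test prevents `pulumi up` from running, which is the same guarantee the workflow steps provide by running sequentially.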
Example: Deploy infrastructure with Pulumi
name: Pulumi Deployment

on:
  push:
    branches:
      - main

env:
  # Environment variables for AWS credentials and private key.
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  AWS_DEFAULT_REGION: ${{ secrets.AWS_DEFAULT_REGION }}
  PRIVATE_KEY: ${{ secrets.PRIVATE_KEY }}

jobs:
  pulumi-deploy:
    runs-on: ubuntu-latest
    environment: dev

    steps:
      # Check out the repository code to the runner.
      - name: Checkout code
        uses: actions/checkout@v3

      # Set up AWS credentials for use in subsequent actions.
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v3
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: eu-central-1

      # Create an SSH directory, add the private SSH key, and set permissions.
      - name: Set up SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/XYZ.pem
          chmod 600 ~/.ssh/XYZ.pem

      # Set up Python 3.9 environment for running Python-based tasks.
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      # Set up Node.js 14 environment for running Node.js-based tasks.
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '14'

      # Install Node.js project dependencies specified in `package.json`.
      - name: Install project dependencies
        run: npm install
        working-directory: .

      # Install the Pulumi CLI with the install script and add it to the PATH.
      - name: Install Pulumi
        run: |
          curl -fsSL https://get.pulumi.com | sh
          echo "$HOME/.pulumi/bin" >> "$GITHUB_PATH"

      # Upgrade pip and install Python dependencies from `requirements.txt`.
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
        working-directory: .

      # Log in to Pulumi using the access token stored in secrets.
      - name: Login to Pulumi
        run: pulumi login
        env:
          PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}

      # Set Pulumi configuration to specify AWS region for the `dev` stack.
      - name: Set Pulumi configuration for tests
        run: pulumi config set aws:region eu-central-1 --stack dev

      # Select the `dev` stack for Pulumi operations.
      - name: Pulumi stack select
        run: pulumi stack select dev
        working-directory: .

      # Set AWS region configuration and run tests using pytest.
      - name: Run tests
        run: |
          pulumi config set aws:region eu-central-1
          pytest
        working-directory: .

      # Preview the changes that Pulumi will apply to the `dev` stack.
      - name: Preview Pulumi changes
        run: pulumi preview --stack dev
        working-directory: .

      # Apply the changes to the `dev` stack with Pulumi.
      - name: Update Pulumi stack
        run: pulumi up --yes --stack dev
        working-directory: .

      # Retrieve and display outputs from the Pulumi stack.
      - name: Pulumi stack output
        run: pulumi stack output
        working-directory: .

      # Destroy the `dev` stack to clean up resources.
      - name: Cleanup Pulumi stack
        run: pulumi destroy --yes --stack dev
        working-directory: .

      # Retrieve and display outputs from the Pulumi stack after destruction.
      - name: Pulumi stack output (after destroy)
        run: pulumi stack output
        working-directory: .

      # Log out from the Pulumi session.
      - name: Logout from Pulumi
        run: pulumi logout
DebConf 2024, from 28 July to 4 August 2024: https://debconf24.debconf.org/
Last week the annual Debian Community Conference DebConf took place in Busan, South Korea. Four NetApp employees (Michael, Andrew, Christoph and Noël) participated for the whole week at Pukyong National University. DebCamp takes place before the conference; this is where the infrastructure is set up and the first collaborations happen. The camp is described in a separate article: https://www.credativ.de/en/blog/credativ-inside/debcamp-bootstrap-for-debconf24/
There was a heat wave with high humidity in Korea at the time, but the venue and the accommodation at the university are air-conditioned, so collaboration work, talks and BoF (Birds of a Feather) sessions were possible despite the circumstances.
Around 400 Debian enthusiasts from all over the world were on site, and additional people attended remotely via the video streams and the Matrix online chat #debconf:matrix.debian.social.
The content team created a schedule covering different aspects of Debian: technical, social, political, and more.
https://debconf24.debconf.org/schedule/
There were two bigger announcements during DebConf24:
- The new distribution eLxr https://elxr.org/, based on Debian and initiated by Wind River:
https://debconf24.debconf.org/talks/138-a-unified-approach-for-intelligent-deployments-at-the-edge/
Two takeaways I understood from this talk: Wind River wants to replace CentOS and prefers a binary distribution.
- The Debian package management system will get a new solver: https://debconf24.debconf.org/talks/8-the-new-apt-solver/
The list of interesting talks from a full conference week is much longer. Most talks and BoFs were streamed live, and the recordings can be found in the video archive:
https://meetings-archive.debian.net/pub/debian-meetings/2024/DebConf24/
It is a tradition to have a day trip for socializing and for getting a more interesting view of the city and the country: https://wiki.debian.org/DebConf/24/DayTrip/ (sorry, the details of the three day trips are only on the website for participants).
For the annual conference group photo we had to go outside into the heat and high humidity, but I hope you will not see us sweating.
The Debian Conference 2025 will be in July in Brest, France: https://wiki.debian.org/DebConf/25/ and we will be there. :) Maybe it will be a chance for you to join us.
See also Debian News: DebConf24 closes in Busan and DebConf25 dates announced
DebConf24 https://debconf24.debconf.org/ took place from 2024-07-28 to 2024-08-04 in Busan, Korea.
Four employees (three Debian developers) from NetApp had the opportunity to participate in the annual event, which is the most important conference in the Debian world: Christoph Senkel, Andrew Lee, Michael Meskes and Noël Köthe.
DebCamp
What is DebCamp? DebCamp usually takes place a week before DebConf begins. For participants, DebCamp is a hacking session that takes place just before DebConf. It’s a week dedicated to Debian contributors focusing on their Debian-related projects, tasks, or problems without interruptions.
DebCamps are largely self-organized since it’s a time for people to work. Some prefer to work individually, while others participate in or organize sprints. Both approaches are encouraged, although it’s recommended to plan your DebCamp week in advance.
During this DebCamp, there are the following public sprints:
- Python Team Sprint: QA work on the Python Team’s packages
- l10n-pt-br Team Sprint: pt-br translation
- Security Tools Packaging Team Sprint: QA work on the pkg-security Team’s packages
- Ruby Team Sprint: Work on the transition to Ruby 3.3
- Go Team Sprint: Get newer versions of docker.io, containerd, and podman into unstable/testing
- Ftpmaster Team Sprint: Discuss potential changes in the ftpmaster team, its workflow and communication
- DebConf24 Boot Camp: Guide people new to Debian, with a focus on Debian packaging
- LXQt Team Sprint: Workshop for newcomers and work on the latest upstream release based on Qt6 and Wayland support
Scheduled workshops include:
GPG Workshop for Newcomers:
Asymmetric cryptography is a daily tool in Debian operations, used to establish trust and secure communications through email encryption, package signing, and more. In this workshop, participants will learn to create a PGP key and perform essential tasks such as file encryption/decryption, content signing, and sending encrypted emails. After creation, the key will be uploaded to public keyservers, enabling attendees to participate in our Continuous Keysigning Party.
Creating Web Galleries with Geo-Tagged Photos:
Learn how to create a web gallery with integrated maps from a geo-tagged photo collection. The session will cover the use of fgallery, openlayers, and a custom Python script, all orchestrated by a Makefile. This method, used for a South Korea gallery in 2018, will be taught hands-on, empowering others to showcase their photo collections similarly.
Introduction to Creating .deb Files (Debian Packaging):
This session will delve into the basics of Debian packaging and the Debian release cycle, including stable, unstable, and testing branches. Attendees will set up a Debian unstable system, build existing packages from source, and learn to create a Debian package from scratch. Discussions will extend online at #debconf24-bootcamp on irc.oftc.net.
Our colleague Andrew is part of the organization team this year. He helped arrange the Cheese and Wine party and proposed organizing a “Coffee Lab”, where people can bring coffee equipment and beans from their countries and share them with each other during the conference. Andrew successfully set up the Coffee Lab in the social space with support from the “Local Team” and contributors Kitt, Clement, and Steven. They provided a diverse selection of beans and teas from countries such as Colombia, Ethiopia, India, Peru, Taiwan, Thailand, and Guatemala. Additionally, they shared various coffee-making tools, including the “Mr. Clever Dripper,” AeroPress, and AerSpeed grinder.
DebCamp also allows the DebConf committee to work together with the local team to prepare additional details for the conference. During DebCamp, the organization team typically handles the following tasks:
- Setting up the Frontdesk: This involves providing conference badges (with maps and additional information) and distributing SWAG such as food vouchers, conference t-shirts, conference cups, USB-powered fans, and sponsor gifts.
- Setting up the network: This includes configuring the network in conference rooms, hack labs, and video team equipment for live streaming during the event.
- Accommodation arrangements: Assigning rooms for participants to check in to on-site accommodations.
- Food arrangements: Catering to various dietary requirements, including regular, vegetarian, vegan, and accommodating special religious and allergy-related needs.
- Setting up a social space: Providing a relaxed environment for participants to socialize and get to know each other.
- Writing daily announcements: Keeping participants informed about ongoing activities.
- Arranging childcare service.
- Organizing day trip options.
- Arranging parties.
In addition to the organizational part, our colleague Andrew also attended and arranged private sprints during DebCamp and continued through DebConf with his LXQt team BoF and a private LXQt team workshop for newcomers, where the team received contributions from newcomers. The youngest one is only 13 years old; he created his first GPG key during the GPG workshop and attended the LXQt team workshop, where he managed to fix a few bugs in Debian during the session.
Young kids in DebCamp
At DebCamp, two young attendees, aged 13 and 10, participated in a GPG workshop for newcomers and created their own GPG keys. The older child hastily signed another new attendee’s key without proper verification, not fully grasping that Debian’s security relies on the trustworthiness of GPG keys. This prompted a lesson from his Debian Developer father, who explained the importance of trust by comparing it to entrusting someone with the keys to one’s home. Realizing his mistake, the child considered how to rectify the situation since he had already signed and uploaded the key. He concluded that he could revoke the old key and create a new one after DebConf, which he did, securing his new GPG and SSH keys with a Yubikey.
How and when to use Software-Defined Networks in Proxmox VE
What is Software-Defined Networking?
How to configure a SDN
Now that we know the basics and possibilities of Software-Defined Networking (SDN), it is time to set up such a network within a Proxmox cluster.
Proxmox comes with support for software-defined networking (SDN), allowing users to integrate various types of network configurations to suit their specific networking needs. You can select from several SDN zone types. “Simple” is aimed at straightforward networking setups that do not need advanced features. For environments requiring network segmentation, VLAN support is available, providing the means to isolate and manage traffic within distinct virtual LANs. More complex scenarios might benefit from QinQ support, which allows multiple VLAN tags on a single interface. Particularly interesting for data centers is VXLAN support, which extends layer 2 networking over a layer 3 infrastructure and significantly increases the number of possible virtual networks, which would otherwise be limited to 4096 VLANs. Finally, EVPN support is also part of Proxmox’s SDN offerings; it facilitates advanced layer 2 and layer 3 virtualization and provides a scalable control plane based on BGP (Border Gateway Protocol) for multi-tenancy environments.
In this guide, we’ll walk through the process of setting up a streamlined Software-Defined Network (SDN) within a Proxmox cluster environment. The primary goal is to establish a new network, including its own network configuration, that is automatically propagated across all nodes within the cluster. This newly created network gets its own IP space, in which virtual machines (VMs) receive their IP addresses dynamically via DHCP. This setup eliminates the need for manual IP forwarding or Network Address Translation (NAT) on the host machines. An additional advantage of this configuration is the consistency it offers: the gateway for the VMs always remains the same, regardless of the specific host node they are running on.
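The 4096 limit mentioned above follows directly from the width of the identifier fields: a VLAN ID is 12 bits wide, while a VXLAN network identifier (VNI) is 24 bits wide. The short calculation below illustrates the difference; the helper name `id_space` is made up for this example.

```python
VLAN_ID_BITS = 12    # Width of the VLAN ID field in an IEEE 802.1Q tag
VXLAN_VNI_BITS = 24  # Width of the VXLAN network identifier (VNI)


def id_space(bits: int) -> int:
    """Number of distinct identifiers that fit into a field of `bits` bits."""
    return 2 ** bits


vlan_ids = id_space(VLAN_ID_BITS)
vxlan_ids = id_space(VXLAN_VNI_BITS)

print(f"VLAN IDs:   {vlan_ids}")               # 4096
print(f"VXLAN VNIs: {vxlan_ids}")              # 16777216
print(f"Factor:     {vxlan_ids // vlan_ids}")  # 4096
```

In other words, VXLAN offers roughly 16 million segment identifiers, 4096 times as many as plain VLAN tagging.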
Configuration
Configuring Software-Defined Networking (SDN) has become very easy in recent Proxmox VE versions, where the whole process can be done in the Proxmox web UI. Therefore, we just connect to the Proxmox management web interface, which is typically reachable at:
- https://HOSTNAME:8006
The SDN options are located in the Datacenter section, in the SDN subsection. All further work is done there. Therefore, we navigate to:
--> Datacenter
----> SDN
--------> Zones
The menu on the right side offers to add a new zone; here we select the zone type Simple. A new window pops up, in which we directly activate the advanced options at the bottom. Afterwards, we provide the required details:
ID: devnet01
MTU: Auto
Nodes: All
IPAM: pve
Automatic DHCP: Activate
The ID represents the unique identifier of this zone. It makes sense to give it a recognisable name. Usually, we do not need to adjust the MTU size for this kind of default setup, although there may always be some corner cases. In the Nodes field, the zone can be assigned to specific nodes or simply to all of them; there may also be scenarios where a zone should be limited to specific nodes. With the advanced options enabled, further details such as the DNS server and the forward and reverse DNS zones can be defined. For this basic setup they are not needed, but the automatic DHCP option must be activated.
The next steps are done in the VNets section, where the previously created zone is linked to a virtual network. In the same step, we also provide additional network information such as the network range.
When creating a new VNet, an identifier or name must be given. It often makes sense to align the virtual network name to the previously generated zone name. In this example, the same names will be used. Optionally, an alias can be defined. The important part is to select the desired zone that should be used (e.g., devnet01). After creating the new VNet, we have the possibility to create a new subnet in the same window by clicking on the Create Subnet button.
Within this dialog, some basic network information is entered. In general, we need to provide the desired subnet in CIDR notation (e.g., 10.11.12.0/24). Defining the IP address for the gateway is also possible; in this example, the gateway is placed on the IP address 10.11.12.1. It is important to activate the SNAT option. SNAT (Source Network Address Translation) is a technique that modifies the source IP address of outgoing network traffic so that it appears to originate from a different IP address, which is usually the IP address of the router or firewall. This method is commonly employed to allow multiple devices on a private network to access external networks.
After creating and linking the zone, the VNet and the subnet, the configuration can be applied in the web interface by clicking the Apply button. The configuration is then synced to the selected nodes (in our example, all of them).
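Before entering the subnet and gateway in the dialog, it can help to sanity-check the values. The following is a minimal sketch in Python, using only the standard library `ipaddress` module. It is not part of Proxmox; the helper name `describe_subnet` is hypothetical, and the addresses are simply the ones from this example.

```python
import ipaddress

# Values taken from the example in this article; adjust them for your own VNet.
subnet = ipaddress.ip_network("10.11.12.0/24")
gateway = ipaddress.ip_address("10.11.12.1")


def describe_subnet(net, gw):
    """Return basic facts about a subnet and verify that the gateway is inside it."""
    if gw not in net:
        raise ValueError(f"gateway {gw} is outside of {net}")
    hosts = list(net.hosts())  # usable addresses, without network and broadcast
    return {
        "netmask": str(net.netmask),
        "first_host": str(hosts[0]),
        "last_host": str(hosts[-1]),
        # Upper bound only: every usable address except the gateway.
        # The DHCP range actually handed out to guests may be smaller.
        "guest_addresses": len(hosts) - 1,
    }


info = describe_subnet(subnet, gateway)
print(info)
```

A gateway outside the subnet (for example 10.11.13.1) raises a `ValueError`, which catches a typical typo before the configuration is applied.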
Usage
After applying the configuration on the nodes within the cluster, virtual machines must still be assigned to this network. Luckily, this can easily be done in the regular Proxmox web interface, which now also offers the newly created network devnet01 in the network settings of the VM. Already existing virtual machines can be assigned to this network as well.
When it comes to DevOps and automation, guests can also be assigned to the new network via the API. Such a task could look like the following Ansible example, which creates a container attached to devnet01:
- name: Create Container in Custom Network
  community.general.proxmox:
    vmid: 100
    node: de01-dus01-node03
    api_user: root@pam
    # Jinja2 expressions must be quoted to be valid YAML
    api_password: "{{ api_password }}"
    api_host: de01-dus01-node01
    password: "{{ container_password }}"
    hostname: "{{ container_fqdn }}"
    ostemplate: 'local:vztmpl/debian-12-x86_64.tar.gz'
    # bridge refers to the VNet created above; addresses are obtained via DHCP
    netif: '{"net0":"name=eth0,ip=dhcp,ip6=dhcp,bridge=devnet01"}'
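The netif value in the task above is a JSON object encoded as a string, in which the bridge key names the VNet. When such values are generated by a script, building the string programmatically avoids quoting mistakes. A minimal Python sketch follows; the helper `build_netif` and its parameters are hypothetical and not part of Ansible or Proxmox.

```python
import json


def build_netif(bridge, index=0, name="eth0"):
    """Build a netif string for one DHCP-configured interface on the given bridge."""
    value = f"name={name},ip=dhcp,ip6=dhcp,bridge={bridge}"
    # Compact separators reproduce the format used in the Ansible task.
    return json.dumps({f"net{index}": value}, separators=(",", ":"))


netif = build_netif("devnet01")
print(netif)
```

The result can then be passed to the task as a variable instead of a hard-coded string.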
Virtual machines assigned to this network immediately get IP addresses within our previously defined network 10.11.12.0/24 and can access the internet without any further configuration. VMs can also be moved across nodes in the cluster without any need to adjust the gateway, even if a node is shut down or rebooted for maintenance.
Conclusion
In conclusion, the integration of Software-Defined Networking (SDN) into Proxmox VE is a huge benefit from a technical perspective as well as from a user perspective, since the feature can be used directly from the Proxmox web UI. This ease of configuration empowers even those with limited networking experience to set up and manage more complex network setups.
With simple SDNs, Proxmox also makes it easier to create basic networks that let virtual machines connect to the internet. You don’t have to deal with complicated settings or gateways on the host nodes. This makes it quicker to get virtual setups up and running and lowers the chance of making mistakes that could lead to security problems.
For people just starting out, Proxmox has a user-friendly web interface that makes it easy to set up and control networks. This is really helpful because it means they don’t have to learn a lot of complicated details to get started. Instead, they can spend more time working with their virtual machines and worry less about how to connect everything.
More experienced users will appreciate that Proxmox lets them set up complex networks. This is good for large-scale setups because it can make the network run better, handle more traffic, and keep different parts of the network separate from each other.
Just like other useful integrations (e.g., Ceph), the SDN integration provides huge benefits to the user base and shows the ongoing integration of useful tooling into Proxmox.
On Thursday, 27 June, and Friday, 28 June 2024, I had the amazing opportunity to attend Swiss PGDay 2024. The conference was held at the OST Eastern Switzerland University of Applied Sciences, Campus Rapperswil, which is beautifully situated on the banks of Lake Zurich in a nice, green environment. With approximately 110 attendees, the event had mainly a B2B focus, although not exclusively. Despite the conference being seemingly smaller in scale compared to PostgreSQL events in larger countries, it actually reflected perfectly the scope relevant for Switzerland.
During the conference, I presented my talk “GIN, BTREE_GIN, GIST, BTREE_GIST, HASH & BTREE Indexes on JSONB Data”. The talk summarized the results of my long-term project at NetApp, including newer interesting findings compared to the presentation I gave in Prague at the beginning of June. As far as I could tell, my talk was well received by the audience, and I received very positive feedback.
At the very end on Friday, I also presented a lightning talk, “Can PostgreSQL Have a More Prominent Role in the AI Boom?” (my slides are at the end of the file). In this brief talk, I raised the question of whether it would be possible to implement AI functionality directly in PostgreSQL, including storing embedding models and trained neural networks within the database. Several people in the audience who are involved with ML/AI reacted positively to this proposal, acknowledging that PostgreSQL could indeed play a more significant role in ML and AI topics.
The conference featured two tracks of presentations, one in English and the other in German, allowing for a diverse range of topics and speakers. I would like to highlight some of them:
- Tomas Vondra presented “The Past and the Future of the Postgres Community”, explaining how work on PostgreSQL changes and fixes is organized in Commitfests and discussing future development ideas within the community.
- Laurenz Albe’s talk, “Sicherheitsattacken auf PostgreSQL”, highlighted several potential attack vectors in PostgreSQL, capturing significant attention with surprising examples.
- Chris Engelbert’s presentation, “PostgreSQL on Kubernetes: Dos and Don’ts”, addressed the main issues related to running PostgreSQL on Kubernetes and discussed solutions, including pros and cons of existing PostgreSQL Kubernetes operators.
- Maurizio De Giorgi and Ismael Posada Trobo discussed “Solving PostgreSQL Connection Scalability Issues: Insights from CERN’s GitLab Service”, detailing the challenges and solutions for scalability in CERN’s vast database environment.
- Dirk Krautschick’s talk, “Warum sich PostgreSQL-Fans auch für Kafka und Debezium interessieren sollten?”, showcased examples of using Debezium connectors and Kafka with PostgreSQL for various use cases, including data migrations.
- Patrick Stählin discussed “Wie wir einen Datenkorruptions-Bug mit der Hilfe der Community gefunden und gefixt haben,” addressing issues with free space map files after migration to PostgreSQL 16.
- Marion Baumgartner’s presentation, “Geodaten-Management mit PostGIS,” provided interesting details about processing geo-data in PostgreSQL using the PostGIS extension.
- Prof. Stefan Keller, one of the main organizers and a professor of Data Engineering at Rapperswil OST University, presented “PostgreSQL: A Reliable and Extensible Multi-Model SQL Database”, discussing the multi-model structure of PostgreSQL amid declining interest in NoSQL solutions.
- Luigi Nardi from DBTune presented “Lessons Learned from Autotuning PostgreSQL”, describing an AI-based performance tuning tool developed by his company.
- Kanhaiya Lal and Belma Canik delved into “Beyond Keywords: AI-powered Text Search with pgvector for PostgreSQL,” exploring the use of the pgvector extension to enhance full-text search capabilities in PostgreSQL.
- Gabriele Bartolini, the creator of the PostgreSQL Kubernetes Operator “CloudNativePG,” discussed the history and capabilities of this operator in his talk, “Unleashing the Power of PostgreSQL in Kubernetes”.
At the end of the first day, all participants were invited to a social event for networking and personal exchange, which was very well organized. I would like to acknowledge the hard work and dedication of all the organizers and thank them for their efforts. Swiss PGDay 2024 was truly a memorable and valuable experience, offering great learning opportunities. I am grateful for the chance to participate and contribute to the conference, and I look forward to future editions of this event. I am also very thankful to NetApp-credativ for making my participation in the conference possible.
Photos by organizers, Gülçin Yıldırım Jelínek and author:
From April 18th until Friday, April 21st, KubeCon, in combination with CloudNativeCon, took place in Amsterdam: COMMUNITY IN BLOOM. An exciting event for people with an interest in Kubernetes and cloud native technologies.
At credativ, continuous training in many relevant areas is a must. This of course includes Kubernetes and cloud native technologies. KubeCon/CloudNativeCon has been one of the conferences on our must-attend list for several years now.
A short diary
We started our journey to KubeCon on Tuesday evening with the badge pickup. On Wednesday, the keynotes started with the usual welcome words and opening remarks.
It was really impressive to hear that 10,000 attendees had registered, with an additional 2,000 people on the waiting list, which shows the importance of cloud native technologies. Nearly 58% of the attendees were new to the conference, which indicates that more and more people are getting in touch with Kubernetes and related technologies.
In addition to the usual sponsored keynotes, a short update on the CNCF graduated projects was presented. There was a wide variety of projects, from FluxCD to Prometheus, Linkerd, Harbor and many more.
The second day started once again with keynotes, which included several project updates, e.g. for Kubernetes and incubating projects.
As usual, the last day opened with keynotes. A highlight here was the presentation “Enabling Real-Time Media in Kubernetes”, which gave some insights into a Media Streaming Mesh.
In addition to the talks and presentations, some tutorials took place. These tutorials usually take at least two time slots and therefore provide deeper insight into a specific topic and leave room for questions. The tutorials we attended were well prepared, and several people walked among the attendees to help and answer questions. One of those tutorials showed the usage and benefits of Pixie, which provides deep insights into a system using eBPF and various open source projects.
Beyond the tracks, there was an exhibition area, divided by halls into company booths and a project area. NetApp was represented at several booths.
The main theme this year seemed to be eBPF and Cilium. Various presentations on different tracks highlighted this topic and showed areas of application for eBPF. Different Cilium talks presented various aspects of Cilium, e.g. observability, multi-cluster connections and application failover.
Not so good
One bad thing has to be mentioned: some talks were full. Really full. For some of them we could not get in because the room was already full 15-30 minutes before the talk started. Maybe next time it would be possible to ask all attendees to create a personal schedule in the corresponding app and to assign the rooms according to the number of interested (scheduled) people.
Keynotes, Talks and Presentations
A short overview of the highlights among the talks and presentations we attended:
- “Improve Vulnerability Management with OCI Artifacts – It Is That Easy” – a great talk about images and artifacts related to Trivy.
- “Anatomy of a Cloud Security Breach – 7 deadly sins” – a short recap of security breaches that really occurred. Nothing unknown, but a good summary.
- “Creating a Culture of Documentation” – about integrating the documentation process and creating a culture.
- “Kubernetes Defense Monitoring with Prometheus” – an encouraging presentation about the usage of metrics.
- “Breakpoints in Your Pod: Interactively Debugging Kubernetes Applications” – a great talk about requirements and how to achieve debugging in pods.
- “Effortless Open Source Observability with Cilium, Prometheus and Grafana – LGTM!” – highlighting the observability features of Cilium and service dependency maps with Hubble.
Conclusion
As always, the conference was worthwhile for gaining new impressions, exchanging ideas with interesting people and expanding one’s knowledge. We were certainly happy to participate and are already looking forward to attending the next KubeCon.
In November 1999, 20 years ago, credativ GmbH was founded in Germany, and thus laid the first foundation for the current credativ group.
At that time, Dr. Michael Meskes and Jörg Folz started the business operations in the Technology Centre of Jülich, Germany. Our mission has always been to not only work to live, but also to live to work, because we love the work we do. Our aim is to support widespread use of open source software and to ensure independence from software vendors.
Furthermore, it is very important for us to support and remain active in open source communities. Since 1999 we have continuously taken part in PostgreSQL and Debian events, and supported them financially with sponsorships. Additionally, the development of the Linux operating system has also been a dear and important project of ours. Therefore, we have been a member of the Linux Foundation for over 10 years.
In 2006 we opened our Open Source Support Center. Here, for the first time, our customers had the opportunity to get support for their entire Open Source infrastructure with just one contract. Since then we have expanded and integrated different locations into a globally operating Open Source Support Center.
Thanks to our healthy and steady growth, credativ grew to over 35 employees at its worldwide locations by our 10th anniversary.
Since then, the founding of credativ international GmbH in 2013 marked another milestone in credativ’s history, as the focus shifted from a local to a global market. We were also able to expand into different countries such as the USA and India.
With 20 years of company history, we have now grown to over 80 employees. credativ is now one of the leading providers of services and support for open source software in enterprise use. We thank our customers, business partners, and employees for their time together.
This article was originally written by Philip Haas.
Expansion of Open Source Support Center & PostgreSQL® Competence Center in USA
credativ Group, Maryland, 01/29/2019
credativ group, a leading provider of Open Source solutions and support in both Europe and Asia, announces a strategic expansion into the American market as part of a deal acquiring significant assets of OmniTI Computer Consulting (OmniTI), a highly aligned Maryland technical services firm. The new combined entity forms the basis for the establishment of an enlarged Open Source Support Center and PostgreSQL® Competence Center in a new US headquarters based in Columbia, Maryland.
OmniTI, founded in 1997, has built a client list that reads like a who’s who in tech, including Wikipedia, Google, Microsoft, Gilt, Etsy, and many others. In the process, they developed or contributed to the development of hundreds of Open Source projects, built the OmniOS illumos distribution, and ran the world-renowned Surge conference series. “credativ’s client-first approach and alignment on Open Source makes it a comfortable fit and seamless transition for OmniTI’s staff and customers. After 22 years of business, I’m delighted by this new direction.” says Theo Schlossnagle, Founder of OmniTI, who is leaving the company to concentrate on other activities.
The newly formed US branch of the credativ family has appointed Robert Treat as its CEO. Working in close cooperation with credativ international GmbH, led by Dr. Michael Meskes, Treat will take over further expansion of activities in the USA. A noted Open Source contributor, author, and international speaker, Treat served as both COO and CEO during his time with OmniTI.
Together with the European Open Source Support Center of credativ GmbH, the credativ group will expand its service network for numerous international customers who are currently mainly supported from Europe. Thus the credativ group can extend its unique position as the sole provider of Open Source Support Centers and offer comprehensive support with guaranteed service level agreements for a multitude of open source projects used in today’s business environments.
Robert Treat says “Open Source is at the heart of today’s biggest business disruptors; DevOps and the Cloud. At OmniTI we helped hundreds of companies navigate through these changes over the last 10 years. Now, as part of credativ, we have an even larger pool of experts to choose from to help people master all the necessary aspects of modern technology, including scalability, observability, deployment, automation, and more; all based on the power and flexibility of Open Source.”
Additionally, the US team will now offer a PostgreSQL® Competence Center that ensures the use of the free open source DBMS PostgreSQL® in mission critical applications and supports the entire life cycle of a PostgreSQL® database environment.
In addition, by expanding its existing service and support structure, credativ is one of a very few providers of PostgreSQL® support with a truly global footprint. Dr. Michael Meskes says: “We see to it that the community version of PostgreSQL® can be used as an extremely powerful alternative to the well-known commercial, proprietary databases in the enterprise environment. Apart from the very moderate costs for support, there is no need anymore for further costs for subscriptions or licenses.”
About credativ international GmbH
Founded in 1999, credativ is an independent consulting and services company offering comprehensive services and technical support for the implementation and operation of Open Source software in business applications.
Our Open Source Support Center™ provides the necessary reliability to make use of the numerous advantages of free software for your organization. Offering support around the clock, 365 days a year, our Open Source Support Center™ contains service locations in Germany, India, the Netherlands, Spain, and the United States, providing global premium support for a wide range of Open Source projects that play a fundamental role and are of utmost importance in the IT infrastructures of many companies today.
Moreover, we are advocates for the principles of free software and actively support the development of Open Source software. Most of our consultants are actively involved in numerous Open Source projects, including Debian, PostgreSQL®, Icinga, and many others, and many have been recognized as leading experts in their respective domains.