sales@glitchdata.com

Snowflake Attack 2024

The Australian Signals Directorate (ASD) has released an advisory that Snowflake instances are being targeted from a range of IP addresses. Read ASD’s communications here. Snowflake has further described the threat in its communications here.

These look like brute-force attacks on common Snowflake accounts, prompting the advisory to:
>> Update Admin accounts and Disable non-active accounts.

Another quick way to block these attacks is to blacklist the offending IP addresses in your firewall. You can do this in your Snowflake account with the following steps.
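
One possible way to do the blocking inside Snowflake is a network policy. The sketch below is a minimal helper that validates a list of addresses and builds the `CREATE NETWORK POLICY` and `ALTER ACCOUNT` statements. The policy name and the IP addresses are placeholders from documentation ranges, not the addresses in the advisory, and the helper function is hypothetical. It also assumes the policy should allow all other traffic via `0.0.0.0/0`.

```python
import ipaddress


def build_block_policy_sql(policy_name, blocked_ips):
    """Build Snowflake SQL that blocks the given IPv4 addresses or CIDR ranges.

    Hypothetical helper for illustration; it only produces SQL text and
    does not connect to Snowflake.
    """
    # Reject names that are not plain identifiers, since the name is
    # interpolated into the SQL text.
    if not policy_name.isidentifier():
        raise ValueError(f"unsafe policy name: {policy_name!r}")

    blocked = []
    for entry in blocked_ips:
        # ip_network raises ValueError on malformed input.
        network = ipaddress.ip_network(entry, strict=False)
        # Assumption: only IPv4 entries are used in the policy.
        if network.version != 4:
            raise ValueError(f"only IPv4 entries are handled here: {entry}")
        blocked.append(str(network))

    blocked_list = ", ".join(f"'{cidr}'" for cidr in blocked)
    return [
        # Assumption: allow everything else, block only the listed ranges.
        f"CREATE NETWORK POLICY {policy_name} "
        f"ALLOWED_IP_LIST = ('0.0.0.0/0') "
        f"BLOCKED_IP_LIST = ({blocked_list});",
        f"ALTER ACCOUNT SET NETWORK_POLICY = {policy_name};",
    ]


# Placeholder addresses (documentation ranges); substitute the advisory's list.
statements = build_block_policy_sql(
    "block_advisory_ips", ["198.51.100.7", "203.0.113.0/24"]
)
for statement in statements:
    print(statement)
```

Review the generated statements before running them with an administrative role. An account-level network policy applies to every user, so a mistake in the allowed list can lock out legitimate access.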

Secure Artificial Intelligence Act

A new Bill called the Secure Artificial Intelligence Act has been tabled in the US Senate. The Bill aims to address security vulnerabilities associated with Artificial Intelligence systems. The proposal is:

  • to create a database of all confirmed or attempted incidents of security attacks on significant AI systems,
  • to create a “Security Center” at the National Security Agency (NSA) to engage in security research for AI systems, and
  • to evaluate supply chain risks.

Create a database to track vulnerabilities

The Bill called for the “National Institute of Standards and Technology” and the “Cybersecurity and Infrastructure Security Agency” to create a “National Vulnerability Database”. This database would be a public repository of all artificial intelligence security vulnerabilities. The database must allow private sector entities, public sector organizations, civil society groups, and academic researchers to report such incidents.

The database would contain all confirmed or suspected artificial intelligence security and safety incidents while maintaining the confidentiality of the affected party. Incidents are to be classified in a manner that supports accessibility and the ability to prioritise responses to concerning models, especially those used in critical infrastructure, safety-critical systems and large enterprises.

The Bill also proposed updating the “Common Vulnerabilities and Exposures Program”, which is the current reference guide and classification system for all information security vulnerabilities, sponsored by the Cybersecurity and Infrastructure Security Agency.

Establish an Artificial Intelligence Security Centre

The Security Centre established by the NSA must make available a research test-bed and develop guidance on how to prevent “counter-artificial intelligence techniques”.

Evaluate consensus standards and supply chain risks

The Bill acknowledged the need to update certain practices in light of AI. It called to “evaluate whether existing voluntary consensus standards for vulnerability reporting effectively accommodate artificial intelligence security vulnerabilities.” In other words, the Bill postulates that the widely accepted standards for reporting security vulnerabilities may need updating with the rise of artificial intelligence.

Further, it called for a re-evaluation of best practices concerning supply chain risks associated with training and maintaining artificial intelligence models. These could include risks associated with:

  • reliance on a remote workforce and foreign labour for tasks like data collection, cleaning, and labelling
  • human feedback systems used to refine AI systems
  • inadequate documentation of training data and test data storage, as well as limited provenance of training data
  • the use of large-scale, open-source datasets by public and private sector developers in the United States
  • the use of proprietary datasets containing sensitive or personally identifiable information.

AWS Summit Sydney 2024

AWS Summit Sydney recently concluded two days (10-11 Apr) of presentations on the latest technologies and successes in the AWS ecosystem. Generative AI was a clear focus, with AWS discussing the latest methods using Amazon Q. Check out:

Innovation Day has a strong focus on Artificial Intelligence and ho

See the videos of the session here.

Top 10 Plugins for WordPress in 2024

Since 2003, WordPress has retained its popularity as a feature-rich platform for rapidly building a business and establishing a web presence. This is thanks to a thriving ecosystem of web developers, plugin developers, and theme creators who can make WordPress do almost anything.

So what are the Top 10 plugins that keep the platform amazing? This year, we see a number of AI-driven plugins taking the top spots. These include chatbot and generative AI plugins. Our assessment includes the following plugins:

#1: HubSpot

HubSpot is a CRM application with chatbot integrations for WordPress. Investigate further.

#2: AI Power

AI Power leverages generative AI to speed up creating products, posts and content for WordPress. Investigate further.

#3: WooCommerce

WooCommerce retains significant relevance with its ability to provide online shopping for WordPress websites. WooCommerce is supported by Automattic. Investigate further.

#4: BuddyPress

Social media platform engine for WordPress. Although not commonly spoken about, BuddyPress has the ability to create small social media groups on the WordPress platform. Investigate further.

#5: Jetpack

Jetpack is another flagship plugin from Automattic. It contains tools that turn your website into a significantly more robust and performant platform. These range from caching and backups to security uplifts. Investigate further.

#6: FIFU

FIFU is a useful tool for WordPress sites that syndicate content. It sets up featured images for posts automatically, using referenced images (or videos) from syndicated content or from its repository of images. FIFU saves you many hours in sourcing images for your website. Investigate further.

#7: SiteGround Security Optimizer

This is a pleasantly performant security plugin which tracks logins, hardens a WordPress installation with common treatments, and provides custom logins. Investigate further.

#8: Carbon Fields

Carbon Fields is a supporting plugin that facilitates the creation of forms in downstream plugins. It provides a good alternative to “Advanced Custom Fields”, which has become prohibitively expensive. Investigate further.

#9: Jetpack CRM

An initiative by Automattic to provide customer management capability. It’s not commonly used and needs tighter integration with chatbots. It’s still early days for Jetpack CRM, and it can only get better. Investigate further.

#10: RSS Aggregator

Yes, RSS aggregation is not as popular as it once was, but it remains the core of the WordPress community, where posts are created, back-linked, and shared. RSS Aggregator accelerates content distribution by syndicating and amplifying content. Investigate further.

You read this first at https://glitchdata.com

Digital ID Australia

Australia has introduced the Digital ID Bill. With an already strong identity management regime, the Digital ID Bill seeks to bolster existing arrangements. The key change in identity is the “digital” element. This is where a network of accredited identity parties will be governed through ACCC digital platforms.

The Digital ID Bill is funded by $145.5 million in new money. This is appropriated to:

  • Australian Competition and Consumer Commission (ACCC) – $67 million over two-and-a-half years (2023-24) “to perform interim regulatory functions under the Digital ID legislation from 1 July 2024.”
  • Attorney-General’s Department – $56 million over four years “for the continued operation of the Identity Matching Services” and “a further $3.3 million to enhance the Credential Protection Register to enable the Government to respond to future data breaches and support and protect victims of identity crime (as previously announced as part of the 2023-2030 Australian Cyber Security Strategy).”
  • Funding for key priorities including ICT updates to myGovID, communications to improve individual and business awareness and understanding of Digital ID, supporting the Office of the Australian Information Commissioner to prepare for its privacy oversight role of Digital ID and enabling the Department of the Treasury to support the ACCC to deliver its Digital ID functions and scope options for a data and digital regulator.

https://ministers.ag.gov.au/media-centre/strengthening-australias-digital-id-system-30-11-2023
https://www.aph.gov.au/Parliamentary_Business/Bills_Legislation/Bills_Search_Results/Result?bId=s1404
https://www.digitalidentity.gov.au/digital-id-bill

Which Cloud?

SAP IDM end-of-life in 2027

Love it or hate it, SAP IDM reaches end of maintenance in 2027, and SAP is deferring to alternative IDM solutions. SAP announced this on their website, with Microsoft Entra ID mentioned as an alternative.

There are plenty of other IDM alternatives, but each will require integration.

https://community.sap.com/t5/technology-blogs-by-sap/preparing-for-sap-identity-management-s-end-of-maintenance-in-2027/ba-p/13596101

Top 10 Data Breaches in 2023

# | Organisation name | Sector | Location | Known records breached | Month of public disclosure
1 | DarkBeam | Cyber security | UK | >3,800,000,000 | September
2 | Real Estate Wealth Network | Construction/real estate | USA | 1,523,776,691 | December
3 | Indian Council of Medical Research (ICMR) | Healthcare | India | 815,000,000 | October
4 | Kid Security | IT services/software | Kazakhstan | >300,000,000 | November
5 | Twitter (X) | IT services/software | USA | >220,000,000 | January
6 | TuneFab | IT services/software | Hong Kong | >151,000,000 | December
7 | Dori Media Group | Media | Israel | >100 TB* | December
8 | Tigo | Telecoms | Hong Kong | >100,000,000 | July
9 | SAP SE Bulgaria | IT services/software | Bulgaria | 95,592,696 | November
10 | Luxottica Group | Manufacturing | Italy | 70,000,000 | May

The Sanctity of Access

The collective store of knowledge has been generated by humans in forums, Reddit, Stack Overflow, websites, wikis, and search engines. We have used this to train ChatGPT. Since GPT-4, the generation of knowledge is moving from the public sphere into direct private chats. This is a problem.

So the generation and capture of knowledge is now privately sourced and pooled into a machine, bypassing the human public domain.

Where will the next AI model get its training data from? GPT-Next will be trained on legacy data.

This raises multiple questions related to ethics, the sanctity of data access, and the increasing importance of legislating for public data.

AI will become a dominant source of knowledge simply by virtue of growth. It depends on training data which could become unavailable, monopolised, or licensed by a chatbot.

Trusting the output from AI is just as alarming. If we lose access to the training data and propagate AI outputs, we will lose the collective memory of human-generated thinking and lose the ability to validate AI. Just like search, humanity will be told what to think and how to think, and will be programmed by AI.

So to avoid a dangerous feedback loop, we need to individually and collectively support (and extend) the public domain.

What is ChatGPT?

So 2023 started with a big buzz on ChatGPT. OpenAI made ChatGPT available at the end of November 2022, and soon the internet was abuzz with excitement. The ChatGPT app gained some 1 million users after 5 days, and a reported 10 million users after 40 days. This is way faster than previous upstarts like Instagram, Twitter etc.

So what exactly is ChatGPT? It’s an intelligent chatbot driven by a model from the GPT-3 family (GPT-3.5). GPT stands for “Generative Pre-trained Transformer”. Essentially, GPT-3 is a neural network with 175 billion weighted connections (parameters). When it was released in 2020, it was the largest language model published to date; the previous largest, Microsoft’s “Turing-NLG”, had 17 billion parameters. GPT-4 is speculated to have ~500 billion parameters, though OpenAI has not disclosed a figure. As a loose comparison, the human brain has approximately 86 billion neurons (sometimes less), although neurons and parameters are not equivalent units. Training combines supervised learning and reinforcement learning. The architecture also includes components such as the decoder, the language model head, pre-training, and fine-tuning.

So ChatGPT has sparked lots of interesting conversations and use cases. People have started to apply it to work, assignments, content generation, coding, and general life questions. This was made possible by training the model on hundreds of gigabytes of text. Will it surpass the Google search engine simply by being able to create a more intelligent answer beyond the search result?

Architecture

  1. The Transformer architecture: The Transformer architecture is the foundation of ChatGPT. It is a neural network architecture that uses self-attention mechanisms to process input sequences. It can handle input sequences of varying lengths (up to a fixed maximum) and allows for parallel processing of the input.
  2. The Encoder: In the original Transformer, the encoder is composed of multiple layers of self-attention and feed-forward neural networks that process the input text. GPT models are decoder-only and do not use a separate encoder.
  3. The Decoder: The decoder is also composed of multiple layers of self-attention and feed-forward neural networks. In GPT models it both reads the prompt and generates the output text, one token at a time.
  4. The Language Model Head: The language model head is a linear layer with weights that are learned during pre-training. It is used to predict the next token in the sequence, given the previous tokens.
  5. Dialogue generation: Responses in a dialogue are produced by the same language model head. There is no separate dialogue-specific output layer; the behaviour comes from fine-tuning the model on conversational data.
  6. Pre-training: ChatGPT is pre-trained on a large dataset of text, which enables it to generate human-like text in response to a given prompt.
  7. Fine-Tuning: The model is fine-tuned on conversational data, using supervised learning and reinforcement learning from human feedback, to improve its ability to generate responses in the context of a dialogue.
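
The self-attention step at the heart of the architecture can be sketched in a few lines. This is a minimal, pure-Python illustration of scaled dot-product attention with a causal mask (each token only attends to itself and earlier tokens, as in GPT-style decoders). It is a teaching sketch, not OpenAI's implementation: the toy vectors are made up, and the learned projection matrices that normally produce queries, keys and values are omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees positions 0..i only."""
    d_k = len(keys[0])
    outputs = []
    for i, query in enumerate(queries):
        # Similarity of this token's query with every visible key.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy 2-dimensional "token embeddings" (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Assumption: queries, keys and values are the embeddings themselves.
result = causal_self_attention(tokens, tokens, tokens)
for position, vector in enumerate(result):
    print(position, [round(x, 3) for x in vector])
```

The first token can only attend to itself, so its output equals its own value vector. Later tokens blend information from everything before them, which is how the model conditions each prediction on the preceding context.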

Training

Training is another interesting consideration. How much data is needed to train GPT-3? (570 gigabytes of text). How long did it take? ChatGPT is trained (censored/biased) to avoid returning harmful answers. As the size of the model increases, training time and data will also significantly increase.

Ethics

A concern for the future is that ChatGPT models are biased. Another concern is that humans become progressively obsolete. Ethics for AI is going to be crucial for humanity.

ChatGPT Trajectory

  1. GPT-1 (Generative Pre-trained Transformer 1), released by OpenAI in June 2018, was the first model in the series that led to ChatGPT. It was pre-trained on the BookCorpus dataset (text from thousands of unpublished books) and had 117 million parameters.
  2. GPT-2 (Generative Pre-trained Transformer 2) was released shortly after, in Feb 2019. It was pre-trained on a much larger dataset of about 40GB of web text and had 1.5 billion parameters, making it more than ten times larger than GPT-1.
  3. GPT-3 (Generative Pre-trained Transformer 3) was released in 2020. It was pre-trained on a massive dataset of about 570GB of filtered text and had 175 billion parameters. It could perform a wide range of language tasks, such as text generation, language translation, and question answering, often from just a few examples in the prompt.
  4. GPT-4 (Generative Pre-trained Transformer 4) was released in March 2023. OpenAI has not disclosed its training data size or parameter count; figures such as 500 billion parameters are speculation. It handles the same range of language tasks with more accuracy and fluency than GPT-3.

Current Limitations of ChatGPT

  • GPT-3 lacks long-term memory – unlike humans, the model does not learn anything from long-term interactions.
  • Lacks interpretability – this is a problem that affects extremely large and complex models in general. GPT-3 is so large that it is difficult to interpret or explain the output that it produces.
  • Limited input size – transformers have a fixed maximum input size. For the original GPT-3 this is 2,048 tokens, so the prompt and the response together cannot exceed roughly a few pages of text.
  • Slow inference time – because GPT-3 is so large, it takes more time for the model to produce predictions. Imagine how long GPT-4 will take.
  • GPT-3 suffers from bias – all models are only as good as the data that was used to train them, and GPT-3 is no exception. The data for GPT-3 and other large language models contains biases. These include hate-speech, religious, and political biases.
  • Training Time – If GPT-4 comes in at the speculated 500B parameters, that is roughly a 2.9x increase over GPT-3’s 175B. Is the trajectory slowing down?
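
To put the "is the trajectory slowing down?" question in numbers, the short sketch below computes the generation-over-generation growth in parameter count. The GPT-1, GPT-2 and GPT-3 figures are the published counts; the GPT-4 value is only the speculative 500B figure quoted in this post, not a number confirmed by OpenAI.

```python
# Parameter counts in billions. GPT-1 to GPT-3 are published figures;
# the GPT-4 entry is the speculative value quoted in the post (assumption).
PARAMS_BILLIONS = [
    ("GPT-1", 0.117),
    ("GPT-2", 1.5),
    ("GPT-3", 175.0),
    ("GPT-4 (speculated)", 500.0),
]


def growth_factors(models):
    """Return how many times larger each model is than its predecessor."""
    factors = {}
    for (_, previous), (name, current) in zip(models, models[1:]):
        factors[name] = current / previous
    return factors


growth = growth_factors(PARAMS_BILLIONS)
for name, factor in growth.items():
    print(f"{name}: {factor:.1f}x the previous model")
```

Under these assumptions the jump to GPT-4 would be under 3x, compared with jumps of more than 10x and more than 100x for the two previous generations. The conclusion depends entirely on the unconfirmed GPT-4 figure.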

Here are some links to articles: