derian-cordoba/laravel-ab-testing
最新稳定版本:1.0.0
Composer 安装命令:
composer require derian-cordoba/laravel-ab-testing
包简介
AB testing package for laravel
README 文档
README
Standalone experimentation primitives for Laravel: typed experiments, deterministic bucketing, reusable metrics, Bayesian and frequentist analysis, and a package architecture that can grow into a full experimentation platform.
This package is intentionally not a thin feature-flag helper. It is designed around real experimentation concerns: stable assignment units, explicit control and traffic allocation, reusable metrics, sample-ratio mismatch detection, and normalized experiment definitions that can be consumed by both code-defined and runtime-defined workflows.
Contents
- Current status
- Why this package
- Installation
- Quick start
- Defining units
- Defining metrics
- Defining variants
- Defining experiments
- Registering experiments
- Resolving variants
- Tracking metrics
- Real-world request flow
- Dashboard
- Discovery and caching
- Running statistical analysis
- Feature flags
- Architecture
- Production integration
- Testing
- Current limitations
- Contributing
- License
Current status
This repository already includes the foundational engine and several important runtime pieces:
- Attribute-based experiment definitions via
#[AsExperiment],#[AsMetric],#[AsUnit], and related attributes - Typed variant enums with explicit control and weight declarations
- Deterministic bucketing with an SHA-256 strategy
- Assignment persistence behind repository contracts
- A resolution pipeline for eligibility, traffic checks, sticky assignment, and layer exclusion
- Database-backed event recording for exposures and metrics
- Incremental rollups over raw events for dashboard reads
- Bayesian and frequentist analysis engines
- Sample-ratio mismatch detection
- A built-in Livewire dashboard with experiment overview, detail pages, controls, audit trail, and results tables
- Experiment registration from configuration and optional discovery
Some larger platform pieces are still intentionally not presented as finished:
- Runtime-defined experiment authoring in the dashboard
- Full feature-flag registry and runtime integration
The README below focuses on what exists today and shows how to use the package honestly in its current form.
Why this package
Most Laravel A/B testing packages stop at "split users into two buckets and count conversions". This package is aiming at a stricter model:
- Experiments are typed PHP classes, not magic strings
- Variants are backed enums, not loose arrays
- Assignment is deterministic and sticky
- Assignment and exposure are different events
- Metrics are reusable definitions with explicit roles
- Analysis is unit-based, not raw-event-based
- Frequentist and Bayesian results can live side by side
- The engine consumes a normalized
ExperimentDefinition, so code-defined and runtime-defined experiments can converge on the same core
Installation
Install the package with Composer:
composer require derian-cordoba/laravel-ab-testing
Publish the configuration file:
php artisan vendor:publish --tag=ab-testing-config
Run the migrations:
php artisan migrate
If you want to customize the package dashboard views, publish them:
php artisan vendor:publish --tag=ab-testing-dashboard-views
The package supports Laravel package discovery, so you should not need to register the service provider manually.
Requirements
- PHP
^8.4 - Laravel components
illuminate/support ^13.0 illuminate/console ^13.0
Quick start
The package revolves around four structural definitions:
- A unit type that can be bucketed
- A reusable metric
- A backed enum of variants
- An experiment class that ties them together
Here is a complete minimal example.
1. Define the assignment unit
<?php declare(strict_types=1); namespace App\ABTesting; use ABTests\Attributes\AsUnit; use ABTests\Contracts\Bucketable; use App\Models\User; #[AsUnit(key: 'user')] final readonly class ExperimentUser implements Bucketable { public function __construct(private User $user) { // } public function bucketingKey(): string { return "user:{$this->user->getKey()}"; } public function attributes(): array { return [ 'plan' => $this->user->plan, 'country' => $this->user->country, ]; } }
2. Define the metric
<?php declare(strict_types=1); namespace App\ABTesting\Metrics; use ABTests\Attributes\AsMetric; use ABTests\Enums\Aggregate; use ABTests\Enums\MetricType; use ABTests\Metric; #[AsMetric( key: 'checkout-conversion', type: MetricType::binary, event: 'checkout.completed', aggregate: Aggregate::uniqueUnits, )] final class CheckoutConversion extends Metric { // }
3. Define the variants
<?php declare(strict_types=1); namespace App\ABTesting\Variants; use ABTests\Attributes\Control; use ABTests\Attributes\Weight; use ABTests\Concerns\IsVariant; use ABTests\Contracts\Variant; enum CheckoutButtonVariant: string implements Variant { use IsVariant; #[Control] #[Weight(50)] case control = 'control'; #[Weight(50)] case green = 'green'; }
4. Define the experiment
<?php declare(strict_types=1); namespace App\ABTesting\Experiments; use ABTests\Attributes\Analysis; use ABTests\Attributes\AsExperiment; use ABTests\Attributes\PrimaryMetric; use ABTests\Enums\StatisticalEngine; use ABTests\Experiment; use ABTests\Values\Segment; use App\ABTesting\ExperimentUser; use App\ABTesting\Metrics\CheckoutConversion; use App\ABTesting\Variants\CheckoutButtonVariant; #[AsExperiment( key: 'checkout-button-color', unit: ExperimentUser::class, variants: CheckoutButtonVariant::class, name: 'Checkout button color', layer: 'checkout-ui', )] #[PrimaryMetric(CheckoutConversion::class)] #[Analysis( engine: StatisticalEngine::both, confidenceLevel: 0.95, sequential: true, )] final class CheckoutButtonColor extends Experiment { public function audience(): Segment { return Segment::where('plan', 'pro')->and('country', 'US'); } }
5. Register the experiment
Add it to config/ab-testing.php:
'experiments' => [ \App\ABTesting\Experiments\CheckoutButtonColor::class, ],
6. Resolve the variant
use ABTests\Experiments; use App\ABTesting\ExperimentUser; use App\ABTesting\Experiments\CheckoutButtonColor; use App\ABTesting\Variants\CheckoutButtonVariant; $user = new ExperimentUser($request->user()); $variant = Experiments::for($user)->variant(CheckoutButtonColor::class); return match ($variant) { CheckoutButtonVariant::green => view('checkout.green'), CheckoutButtonVariant::control, null => view('checkout.default'), };
7. Track the conversion
use ABTests\Experiments; use App\ABTesting\Metrics\CheckoutConversion; Experiments::track(CheckoutConversion::class, for: $user);
Defining units
Every experiment and metric event is anchored to a unit that implements ABTests\Contracts\Bucketable.
A unit must provide:
bucketingKey(): the stable identifier used for deterministic assignmentattributes(): a flat attribute array used for audience targeting
Use #[AsUnit] to give each unit type a stable storage key:
#[AsUnit(key: 'tenant')] final class TenantUnit implements Bucketable { // ... }
The key argument accepts either:
- a plain string
- a backed enum case
- a unit enum case, which resolves to its case name
That normalization follows the same semantics as Laravel's enum_value() helper.
Defining metrics
Metrics are reusable definitions. An experiment does not describe how a conversion is measured; it references one or more metrics and assigns them roles.
Available metric attributes:
#[PrimaryMetric(...)]#[SecondaryMetric(...)]#[Guardrail(...)]
Example:
#[PrimaryMetric(CheckoutConversion::class)]
#[SecondaryMetric(RevenuePerVisitor::class)]
#[Guardrail(ErrorRate::class, maximumRegression: 0.01)]
The #[AsMetric] attribute describes the metric itself:
#[AsMetric(
key: 'revenue-per-visitor',
type: MetricType::continuous,
event: 'checkout.completed',
aggregate: Aggregate::sum,
valueFromProperty: 'revenue',
attributionWindow: '7 days',
)]
final class RevenuePerVisitor extends Metric
{
public function valueOf(array $properties): float
{
return (float) ($properties['revenue'] ?? 0.0);
}
}
Metric roles
- Primary metric: drives the ship or do-not-ship decision
- Secondary metric: useful for supporting interpretation
- Guardrail metric: must not regress beyond the allowed threshold
Raw metric keys are also supported
The runtime tracking layer accepts either:
- a metric class-string, such as
CheckoutConversion::class - a raw metric key, such as
'checkout-conversion'
That matters because the core engine is designed to support both code-defined and runtime-defined experiments.
Defining variants
Variants must be backed enums that implement ABTests\Contracts\Variant. The easiest way to satisfy the contract is to use the ABTests\Concerns\IsVariant trait.
Rules enforced by the package:
- each variant must have a
#[Weight(...)] - weights must sum to
100 - exactly one variant must carry
#[Control]
Example with three arms:
enum PricingPageVariant: string implements Variant { use IsVariant; #[Control] #[Weight(34)] case control = 'control'; #[Weight(33)] case headlineA = 'headline_a'; #[Weight(33)] case headlineB = 'headline_b'; }
The package preserves the original enum case on resolution, so downstream application code can use match exhaustively and stay type-safe.
Defining experiments
Use #[AsExperiment] on a class extending ABTests\Experiment.
#[AsExperiment(
key: 'pricing-page-headline',
unit: ExperimentUser::class,
variants: PricingPageVariant::class,
name: 'Pricing page headline',
layer: 'pricing-page',
)]
Experiment structure
The attribute captures the structural parts of an experiment:
- stable experiment key
- assignment unit type
- variant enum
- optional human-readable name
- optional mutual-exclusion layer
These are treated as code-owned structure. In the intended architecture, operational state such as running, paused, and traffic percentage belongs to persistence and dashboard workflows, not the class definition itself.
Audience targeting
Override audience() when the experiment should only apply to a segment:
use ABTests\Experiment; use ABTests\Values\Segment; final class PricingPageHeadline extends Experiment { public function audience(): Segment { return Segment::where('country', 'US') ->and('plan', 'pro'); } }
The current Segment API supports conjunctions of criteria and operators such as:
equalsnotEqualsinnotIngreaterThanlessThan
Registering experiments
The package boots an ExperimentRegistry from configuration.
In config/ab-testing.php:
return [ 'experiments' => [ \App\ABTesting\Experiments\CheckoutButtonColor::class, \App\ABTesting\Experiments\PricingPageHeadline::class, ], 'discovery' => [ 'enabled' => false, 'paths' => [ // app_path('Experiments'), ], ], ];
If an experiment cannot be read at boot, the service provider logs the failure instead of crashing the whole app.
Resolving variants
The main entry point is ABTests\Experiments.
$variant = Experiments::for($unit)->variant(CheckoutButtonColor::class);
Resolution goes through a pipeline with the following responsibilities:
- check whether the experiment is active
- verify segment eligibility
- verify traffic allocation
- load any existing sticky assignment
- enforce layer exclusion
- compute a deterministic bucket when needed
- persist the assignment
When a variant is successfully resolved, the package records an exposure event through the configured EventSink.
Null return values
variant() returns null when the unit should not receive the experiment at all. Typical reasons include:
- the experiment is not active
- the unit does not match the audience segment
- the unit falls outside the active traffic percentage
- the unit is excluded by a layer conflict
That is intentional. "Not eligible" is not the same thing as "assigned to control".
Tracking metrics
Track a metric for a unit with either the facade or the per-unit resolver:
Experiments::track(CheckoutConversion::class, for: $unit); Experiments::for($unit)->track(CheckoutConversion::class);
You can also pass a raw metric key:
Experiments::for($unit)->track('checkout-conversion');
For continuous metrics:
Experiments::for($unit)->track('revenue-per-visitor', value: 149.99);
Tracking behavior:
- the metric key is normalized from the metric class when possible
- the resolver iterates over registered experiment definitions
- only experiments with a current assignment for that unit receive a metric event
- the event is recorded with a unique idempotency key
Real-world request flow
Below is a realistic end-to-end checkout flow showing where variant resolution and metric tracking usually belong in an application.
Controller example
<?php declare(strict_types=1); namespace App\Http\Controllers; use ABTests\Experiments; use App\ABTesting\ExperimentUser; use App\ABTesting\Experiments\CheckoutButtonColor; use App\ABTesting\Metrics\CheckoutConversion; use App\ABTesting\Variants\CheckoutButtonVariant; use Illuminate\Http\Request; use Illuminate\View\View; final class CheckoutController { public function show(Request $request): View { $user = new ExperimentUser($request->user()); $variant = Experiments::for($user)->variant(CheckoutButtonColor::class); return match ($variant) { CheckoutButtonVariant::green => view('checkout.green'), CheckoutButtonVariant::control, null => view('checkout.default'), }; } public function complete(Request $request): \Illuminate\Http\RedirectResponse { $user = new ExperimentUser($request->user()); // Your real checkout logic goes here. Experiments::track(CheckoutConversion::class, for: $user); return redirect()->route('checkout.success'); } }
Practical guidance
- Resolve the variant at the point where the user actually experiences the change, not earlier in the request chain for convenience.
- Track the metric at the point where the outcome actually happens, not when you merely intend it to happen.
- Rebuild the same unit shape consistently across requests so sticky assignment remains meaningful.
- Treat
nullas "this user is not in the experiment", not as "fallback control". - Keep experiment keys, metric keys, and unit keys stable over time. Changing them creates a new experiment from the engine's point of view.
A good mental model
One safe way to think about the package is:
variant()is for exposuretrack()is for outcomes
If those two calls happen in the correct places, the rest of the package architecture stays coherent.
Dashboard
The package ships with a Livewire dashboard mounted at /ab-testing by default. It is intended to be the operational surface for experiment state:
- overview of registered experiments
- per-experiment results and analysis
- start, pause, resume, stop, archive, kill-switch, and traffic ramp controls
- audit log entries for state transitions
- manual rollup refresh when you want new data immediately
Dashboard access
The dashboard route prefix, middleware stack, and viewer gate are configured in config/ab-testing.php:
'dashboard' => [ 'path' => env('AB_TESTING_DASHBOARD_PATH', 'ab-testing'), 'middleware' => ['web'], 'viewer_gate' => 'viewAbTestingDashboard', 'results_cache_ttl_seconds' => (int) env('AB_TESTING_RESULTS_CACHE_TTL', 300), 'auto_schedule_rollups' => (bool) env('AB_TESTING_AUTO_SCHEDULE_ROLLUPS', true), ],
Define the gate in your application if you want to restrict access:
use Illuminate\Support\Facades\Gate; Gate::define('viewAbTestingDashboard', function ($user): bool { return $user->is_admin; });
If the configured gate does not exist, the package middleware falls back to whatever access rules your configured dashboard middleware already enforces.
What the dashboard reads
The dashboard does not scan raw event tables on every request. It reads pre-aggregated rows from ab_testing_rollups through ResultsService, then runs the analysis layer on those summaries.
That separation matters:
- raw events remain append-only
- the request path stays fast
- manual refreshes and scheduled rollups produce the dashboard data intentionally
Making results appear
To see data in the dashboard:
- Run the package migrations.
- Register your experiments in
config/ab-testing.php. - Ensure each experiment has an operational state row in
ab_testing_experiments. - Put the experiment into
runningstate, either through the dashboard or your own setup code. - Generate exposures and metric events by resolving variants and tracking outcomes.
- Let the scheduled rollup run or use the dashboard's
Refresh Datacontrol.
Rollup scheduling
When ab-testing.dashboard.auto_schedule_rollups is enabled, the package registers RefreshRollupsJob with Laravel's scheduler automatically. You still need a normal scheduler/queue setup in the host app:
php artisan schedule:work
Or your usual production scheduler:
* * * * * php /path/to/artisan schedule:run >> /dev/null 2>&1
For local development, the dashboard can also trigger a manual per-experiment refresh so you do not have to wait for the next scheduled cycle.
Discovery and caching
The package includes an ab:cache Artisan command:
php artisan ab:cache
When discovery is enabled in config, the command scans configured paths for PHP classes, extracts class names without executing the files, reads #[AsExperiment] definitions, and registers them in the runtime registry.
Example config:
'discovery' => [ 'enabled' => true, 'paths' => [ app_path('Experiments'), ], ],
This is useful when you want explicit deployment-time registration without manually maintaining a long config list.
Running statistical analysis
The analysis layer is already present as a framework-agnostic service. It expects pre-aggregated unit-level summaries rather than raw events.
use ABTests\Statistics\AnalysisService; use ABTests\Values\GenericVariant; use ABTests\Values\MetricSummary; $control = new MetricSummary( variant: new GenericVariant('control', 50, true), countOfUnits: 1000, sumOfValues: 120, sumOfSquaredValues: 120, conversions: 120, ); $treatment = new MetricSummary( variant: new GenericVariant('green', 50), countOfUnits: 1000, sumOfValues: 145, sumOfSquaredValues: 145, conversions: 145, ); $result = app(AnalysisService::class)->analyse( definition: $definition, control: $control, treatment: $treatment, allSummaries: [$control, $treatment], ); $result->verdict; // ship | doNotShip | inconclusive $result->frequentist; // AnalysisResult|null $result->bayesian; // AnalysisResult|null $result->srm; // SampleRatioMismatchResult
What the analysis layer does
- runs the configured engine: frequentist, Bayesian, or both
- performs sample-ratio mismatch detection across all summaries
- returns a
VerdictResultwith both raw engine outputs and the final verdict
Statistical defaults
By default, experiments use:
StatisticalEngine::both0.95confidence- sequential inference enabled
These defaults come from AnalysisConfiguration::default().
Feature flags
The package already contains a FeatureFlag base class and #[AsFeatureFlag] attribute:
use ABTests\Attributes\AsFeatureFlag; use ABTests\FeatureFlag; use ABTests\Values\Context; use App\ABTesting\ExperimentUser; #[AsFeatureFlag( key: 'new-billing-page', unit: ExperimentUser::class, defaultValue: false, )] final class NewBillingPageFlag extends FeatureFlag { public function resolve(Context $context): mixed { if ($context->attribute('plan') === 'enterprise') { return true; } return $this->rollout(20, $context); } }
At the moment, the flag primitives are present, but the full flag registry and application-facing runtime integration are not yet documented as finished in this package. Treat this as a foundation for upcoming work rather than a fully polished surface.
Architecture
The package follows a layered design:
- Domain-like definitions and value objects
- Resolution and analysis services
- Infrastructure behind contracts
- Laravel service provider integration
Key contracts and abstractions:
BucketingStrategyAssignmentRepositoryExperimentStateRepositoryEventSinkAnalysisEngine
Default bindings:
Sha256BucketingStrategyDatabaseAssignmentRepository(database driver, default)DatabaseExperimentStateRepository(database driver, default)DatabaseEventSink(database driver, default)NullEventSink(in-memory driver)
The storage driver is configurable; the in-memory alternatives are available for tests and local development without a database connection (see Storage driver below).
Production integration
The package ships with a full Eloquent database layer enabled by default. Running the migrations is the only step needed to go from zero to a working persistent store.
Storage driver
The storage driver controls which repository implementations are bound. Set it in config/ab-testing.php or via environment variable:
// config/ab-testing.php 'storage' => [ 'driver' => env('AB_TESTING_DRIVER', 'database'), ],
| Driver | Assignment repository | State repository | When to use |
|---|---|---|---|
database (default) |
DatabaseAssignmentRepository |
DatabaseExperimentStateRepository |
Production and staging |
in_memory |
InMemoryAssignmentRepository |
AlwaysRunningExperimentStateRepository |
Unit tests, local dev without a DB |
Migrations
The package loads its migrations automatically. If you prefer to publish them to your application's database/migrations directory:
php artisan vendor:publish --tag=ab-testing-migrations
The package migrations create these tables:
ab_testing_experiments — mutable operational state driven from the dashboard (status, traffic percentage, kill switch). One row per experiment key.
ab_testing_assignments — sticky, deterministic bucketing. One row per (experiment_key, unit_type, unit_key) triple. The composite primary key enforces idempotency: the first write wins and duplicates are silently discarded.
ab_testing_events — append-only exposure and metric event log.
ab_testing_rollups — pre-aggregated experiment summaries used by the dashboard and analysis layer.
ab_testing_guardrail_breaches — breach records emitted when a guardrail crosses its threshold.
ab_testing_audit_log — operational audit trail for dashboard actions.
ab_testing_feature_flag_states — persisted state for feature flags; UI surface is still limited.
ab_testing_experiments.target_sample_size — nullable progress target shown in the dashboard when set.
Creating experiment state rows
The DatabaseExperimentStateRepository looks up a row in ab_testing_experiments by key. If no row exists, the resolver treats the experiment as not running and returns null. You need to insert a row before an experiment can go live, either via the dashboard or your own bootstrap code:
use ABTests\Infrastructure\Database\Models\ExperimentModel; ExperimentModel::create([ 'key' => 'checkout-button-color', 'status' => 'running', 'traffic_percentage' => 50, 'is_killed' => false, ]);
Event sink
When AB_TESTING_DRIVER=database, the package binds DatabaseEventSink automatically and flushes its buffered events at the end of each request. In-memory mode intentionally keeps the no-op sink:
'storage' => [ 'driver' => env('AB_TESTING_DRIVER', 'database'), ],
Production checklist
- Run
php artisan migrateto create the package tables - Insert experiment rows in
ab_testing_experimentsbefore starting experiments - Expose the dashboard behind appropriate middleware or a viewer gate
- Run a scheduler so
RefreshRollupsJobcan keepab_testing_rollupscurrent - If you queue rollups off
sync, run the configured queue worker too - Keep experiment operational state outside the PHP class definition
- Ensure your unit attributes are reproducible across requests
- Monitor failed registration logs during deploys
- Run
php artisan ab:cacheas part of deployment if you use discovery
What not to do
- Do not change variant keys or experiment keys mid-flight.
- Do not count raw events directly as if they were unit-level observations for statistical decisions.
- Do not treat control assignment and ineligibility as the same thing.
Testing
Run the full test suite:
composer test
Or directly with PHPUnit:
vendor/bin/phpunit
Static analysis:
composer analyse
The repository also includes a GitHub Actions workflow for tests.
Current limitations
This package is not pretending to be more finished than it is. Before adopting it in production, keep these points in mind:
- The dashboard is intentionally focused on code-defined experiments in v1; runtime-created experiment authoring is not the main path yet
- Results depend on rollups, so very fresh events are not visible until the scheduled job or manual refresh runs
- Feature-flag definition primitives exist, but the end-user flag runtime is not yet the main surface
- The architecture is designed for runtime-defined experiments too, but the most complete experience today is the code-defined attribute flow
If you want to evaluate the package today, the strongest implemented path is:
- code-defined experiments
- deterministic resolution and Eloquent-backed sticky assignment
- metric tracking contracts
- analysis primitives
Contributing
Contributions are welcome, especially around:
- production persistence implementations
- documentation examples
- feature-flag runtime integration
- dashboard-facing read models
- additional tests around the resolution and analysis layers
If you open a pull request, keep the current package direction in mind:
- strong typing over stringly APIs
- stable experiment, metric, and unit keys
- explicit structure in code
- replaceable infrastructure
License
This package is open-sourced under the MIT license.
统计信息
- 总下载量: 0
- 月度下载量: 0
- 日度下载量: 0
- 收藏数: 0
- 点击次数: 3
- 依赖项目数: 0
- 推荐数: 0
其他信息
- 授权协议: MIT
- 更新时间: 2026-06-11
