* shader_recompiler: Implement AMD buffer bounds checking behavior.
* shader_recompiler: Use SRT flatbuf for bounds check size.
* shader_recompiler: Fix buffer atomic bounds check.
* buffer_cache: Prevent false image-to-buffer sync.
Lowering vertex fetch to formatted buffer surfaced an issue where a CPU modified range may be overwritten with stale GPU modified image data.
* Address review comments.
* shader_recompiler: Replace texel buffers with in-shader buffer format interpretation
* shader_recompiler: Move 10/11-bit float conversion to functions and address some comments.
* vulkan: Remove VK_KHR_maintenance5 as it is no longer needed for buffer views.
* shader_recompiler: Add helpers for composites and bitfields in pack/unpack.
* shader_recompiler: Use initializer_list for bitfield insert helper.
* shader_recompiler: Account for instruction array flag in image type.
* shader_recompiler: Check da flag for all mimg instructions.
* shader_recompiler: Convert cube images into 2D arrays.
* shader_recompiler: Move image resource functions into sharp type.
* shader_recompiler: Use native AMD cube instructions when possible.
* specialization: Fix buffer storage mistake.
* shader_recompiler: Add swizzle support for unsupported formats.
* renderer_vulkan: Rework MRT swizzles and add unsupported format swizzle support.
* shader_recompiler: Clean up swizzle handling and handle ImageRead storage swizzle.
* shader_recompiler: Fix type errors
* liverpool_to_vk: Remove redundant clear color swizzles.
* shader_recompiler: Reduce CompositeConstruct to constants where possible.
* shader_recompiler: Fix ImageRead/Write and StoreBufferFormatF32 types.
* amdgpu: Add a few more unsupported format remaps.
* Implement IMAGE_GATHER4_O
Used by The Last of Us Remastered.
* Fix type on IMAGE_GATHER4_C_LZ
Had a different set of types compared to the other IMAGE_GATHER4 ops.
* IMAGE_GATHER4
* shader_recompiler: Tessellation WIP
* fix compiler errors after merge
DONT MERGE set log file to /dev/null
DONT MERGE linux pthread bb fix
save work
DONT MERGE dump ir
save more work
fix mistake with ES shader
skip list
add input patch control points dynamic state
random stuff
* WIP Tessellation partial implementation. Squash commits
* test: make local/tcs use attr arrays
* attr arrays in TCS/TES
* dont define empty attr arrays
* switch to special opcodes for tess tcs/tes reads and tcs writes
* impl tcs/tes read attr insts
* rebase fix
* save some work
* save work probably broken and slow
* put Vertex LogicalStage after TCS and TES to fix bindings
* more refactors
* refactor pattern matching and optimize modulos (disabled)
* enable modulo opt
* copyright
* rebase fixes
* remove some prints
* remove some stuff
* Add TCS/TES support for shader patching and use LogicalStage
* refactor and handle wider DS instructions
* get rid of GetAttributes for special tess constants reads. Immediately replace some upon seeing readconstbuffer. Gets rid of some extra passes over IR
* stop relying on GNMX HsConstants struct. Change runtime_info.hs_info and some regs
* delete some more stuff
* update comments for current implementation
* some cleanup
* uint error
* more cleanup
* remove patch control points dynamic state (because runtime_info already depends on it)
* fix potential problem with determining passthrough
---------
Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
* shader_recompiler: Move sampling parameter resolution to tracking pass and support more derivative types.
* shader_recompiler: Only track sampler sharp on sample instructions.
* shader_recompiler: Fix Inst args size.
* video_core: texture: image subresources state tracking
* shader_recompiler: use one binding if the same image is read and written
* video_core: added rebinding of changed textures after overlap resolve
* don't use pointers; slight `FindTexture` refactoring
* video_core: buffer_cache: don't copy over the image size
* redundant barriers removed; fixes
* regression fixes
* texture_cache: 3d texture layers count fixup
* shader_recompiler: support for partially bound cubemaps
* added support for cubemap arrays
* don't bind unused color buffers
* fixed depth promotion to do not use stencil
* doors
* bonfire lit
* cubemap array index calculation
* final touches
* shader_recompiler: Add more format swap modes
* texture_cache: Handle stencil texture reads
* emulator: Support loading font library
* readme: Add thanks section
* shader_recompiler: Constant buffers as integers
* shader_recompiler: Typed buffers as integers
* shader_recompiler: Separate thread bit scalars
* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow
* shader_recompiler: Implement data append/consume operations
* clang format
* buffer_cache: Simplify invalidation scheme
* video_core: Remove some invalidation remnants
* adjust
* Implement some missing shader opcodes
Implements TBUFFER_STORE_FORMAT_XYZW, IMAGE_SAMPLE_CD, and IMAGE_GATHER4_C_LZ.
These are seen in https://github.com/shadps4-emu/shadPS4/issues/496.
* Implement IMAGE_STORE_MIP
Not sure if this is the right way to do this, let me know if this needs changing.
* Revert "Implement IMAGE_STORE_MIP"
This reverts commit cff78b5924.
* video_core: Compile shader permutations
* spirv: Only specific storage image format for atomics
* ir: Avoid cube coord patching for storage image
* spirv: Fix default attributes
* data_share: Add more instructions
* video_core: Query storage flag with runtime state
* kernel: Use std::list for semaphore
* video_core: Use texture buffers for untyped format load/store
* buffer_cache: Limit view usage
* vk_pipeline_cache: Fix invalid iterator
* image_view: Reduce log spam when alpha=1 in storage swizzle
* video_core: More features and proper spirv feature detection
* video_core: Attempt no2 for specialization
* spirv: Remove conflict
* vk_shader_cache: Small cleanup
* translator: Implemtn f32 to f16 convert
* shader_recompiler: Add bit instructions
* shader_recompiler: More data share instructions
* shader_recompiler: Remove exec contexts, fix S_MOV_B64
* shader_recompiler: Split instruction parsing into categories
* shader_recompiler: Better BFS search
* shader_recompiler: Constant propagation pass for cmp_class_f32
* shader_recompiler: Partial readfirstlane implementation
* shader_recompiler: Stub readlane/writelane only for non-compute
* hack: Fix swizzle on RDR
* Will properly fix this when merging this
* clang format
* address_space: Bump user area size to full
* shader_recompiler: V_INTERP_MOV_F32
* Should work the same as spirv will emit flat decoration on demand
* kernel: Add MAP_OP_MAP_FLEXIBLE
* image_view: Attempt to apply storage swizzle on format
* vk_scheduler: Barrier attachments on renderpass end
* clang format
* liverpool: cs state backup
* shader_recompiler: More instructions and formats
* vector_alu: Proper V_MBCNT_U32_B32
* shader_recompiler: Port some dark souls things
* file_system: Implement sceKernelRename
* more formats
* clang format
* resource_tracking_pass: Back to assert
* translate: Tracedata
* kernel: Remove tracy lock
* Solves random crashes in Dark Souls
* code: Review comments
* vk_rasterizer: Clear depth buffer when DB_RENDER_CONTROL says so
* video_core: Preliminary storage image support, more opcodes
* renderer_vulkan: a fix for vertex buffers merging
* renderer_vulkan: a heuristic for blend override when alpha out is masked
---------
Co-authored-by: psucien <bad_cast@protonmail.com>